Understanding User Affinity

When building recommendation systems or analyzing user behavior, a common question arises: given that a user interacts with product A, which other products are they most likely to interact with? This seems straightforward, but when we dig deeper, we find that naive approaches fail in different ways.

Approach 1: Raw Co-viewing Counts

Let's take a look at an example. Say we are YouTube, and we want to find out which channels a user is most likely to watch if they watch a specific channel.

The simplest approach is to see how many users watch both channels. For each candidate channel X, we can compute the intersection of viewers:

$$\text{Affinity}(A \to X) = |U_A \cap U_X| \tag{1}$$

where $U_A$ is the set of users who watch seed Channel A and $U_X$ is the set of users who watch target Channel X.
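
To make this concrete, here's a minimal sketch in Python (the channel names and viewer sets below are toy data for illustration, not real YouTube figures):

```python
# Toy data: each channel maps to the set of user IDs who watch it.
viewers = {
    "gaming_channel":  {f"user_{i}" for i in range(50)},
    "mrbeast":         {f"user_{i}" for i in range(10, 90)},
    "similar_creator": {f"user_{i}" for i in range(30, 55)},
}

def raw_affinity(seed: str, target: str) -> int:
    """Affinity(A -> X) = |U_A ∩ U_X|: the number of shared viewers."""
    return len(viewers[seed] & viewers[target])

# Rank every other channel by raw co-viewing count with the seed.
seed = "gaming_channel"
ranking = sorted(
    (c for c in viewers if c != seed),
    key=lambda c: raw_affinity(seed, c),
    reverse=True,
)
print([(c, raw_affinity(seed, c)) for c in ranking])
# [('mrbeast', 40), ('similar_creator', 20)]
```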

Raw Count: Gaming Channel vs MrBeast

[Venn diagram] Total population: 10,000,000 users. Gaming Channel (50,000 total) and MrBeast (8,000,000 total) share 40,000 viewers; 10,000 watch only the gaming channel and 7,960,000 watch only MrBeast. High overlap (40K).

In this example, the affinity score for MrBeast is 40,000. This seems informative! But if we change the seed channel to a different one, we find that the top channels barely change: the most popular channels on the platform dominate every ranking.

Sure, 80% of users who watch the gaming channel also watch MrBeast, but 80% of all YouTube users already watch MrBeast. This tells us nothing more than that gaming channel viewers are just as likely to watch MrBeast as the average user.

Approach 2: Lift Score

So how do we determine whether this behavior is actually out of the norm? We don't just want to know how many viewers of Channel A view Channel X; we want to compare that to how often the general population views Channel X. This is what the lift score measures: how much more likely users are to watch target Channel X given that they watch seed Channel A, compared to the baseline:

$$\text{Lift}(A \to X) = \frac{P(X \mid A)}{P(X)} = \frac{|U_A \cap U_X| / |U_A|}{|U_X| / N} = \frac{|U_A \cap U_X| \cdot N}{|U_A| \cdot |U_X|} \tag{2}$$

where $N$ is the total number of users. A lift of 1 means no association; higher values indicate positive correlation.

Notice that when ranking different target channels X for a fixed seed channel A, both $N$ (total population) and $|U_A|$ (seed channel size) are constants. This means the lift formula simplifies to:

$$\text{Lift}(A \to X) \propto \frac{|U_A \cap U_X|}{|U_X|}$$

In other words, lift is just the overlap normalized by the target channel's size.
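
Since the full lift only needs the overlap, the two channel sizes, and the population, it's a one-liner in code. A sketch, using the numbers from the raw-count example above:

```python
def lift(overlap: int, seed_size: int, target_size: int, n_users: int) -> float:
    """Lift(A -> X) = |U_A ∩ U_X| * N / (|U_A| * |U_X|)."""
    return overlap * n_users / (seed_size * target_size)

# Gaming channel vs MrBeast: the 40K overlap looked impressive,
# but the lift is exactly 1.0 -- no association beyond baseline popularity.
print(lift(40_000, 50_000, 8_000_000, 10_000_000))  # 1.0
```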

Lift Score: Gaming Channel vs Similar Creator

[Venn diagram] Total population: 10,000,000 users. Gaming Channel (50,000 total) and Similar Creator (120,000 total) share 8,500 viewers; 41,500 watch only the gaming channel and 111,500 watch only the similar creator.

The example above shows the lift calculation for the Gaming Channel vs the Similar Creator: $\frac{8500/50000}{120000/10000000} \approx 14.2$. Users of the gaming channel are 14.2x more likely to watch this similar creator than the average user. This is a much more informative score.

But there is another issue: when we rank channels by lift score, the top results are all obscure channels with very few viewers.

Lift Score: Gaming Channel vs Obscure Reviewer

[Venn diagram] Total population: 10,000,000 users. Gaming Channel (50,000 total) and Obscure Reviewer (200 total) share just 15 viewers; 49,985 watch only the gaming channel and 185 watch only the reviewer. Tiny overlap (15), but a lift score of 15x.

Approach 3: Log-adjusted Lift

The solution is to weight the lift score by the log of the overlap size. This rewards channels that have both high lift and a meaningful sample size:

$$\text{Score}(A \to X) = \text{Lift}(A \to X) \times \ln(|U_A \cap U_X| + 1) \tag{3}$$

The natural logarithm compresses large numbers while still rewarding larger overlaps. The +1 prevents undefined behavior when overlap is zero.
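
A minimal sketch of the scoring function, building on the same simplified inputs as the lift sketch above:

```python
import math

def log_adjusted_score(overlap: int, seed_size: int,
                       target_size: int, n_users: int) -> float:
    """Score(A -> X) = Lift(A -> X) * ln(|U_A ∩ U_X| + 1)."""
    lift = overlap * n_users / (seed_size * target_size)
    return lift * math.log(overlap + 1)
```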

Balanced Score: Gaming Channel vs Similar Creator

[Venn diagram] Total population: 10,000,000 users. Gaming Channel (50,000 total) and Similar Creator (120,000 total) share 8,500 viewers. Meaningful overlap (8.5K) with a lift of ~14x gives a balanced score of ~128.

Here the lift is $\frac{8500/50000}{120000/10000000} \approx 14.2$, and the balanced score is $14.2 \times \ln(8501) \approx 128$. Compare this to the obscure reviewer with a lift of 15: $15 \times \ln(16) \approx 42$. The similar creator now ranks higher despite having a slightly lower lift, because the log term properly rewards the meaningful sample size.
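
Using the `log_adjusted_score` sketch from above, we can verify both worked examples:

```python
# Similar creator: lift ~14.17, balanced score ~128.2
print(log_adjusted_score(8_500, 50_000, 120_000, 10_000_000))
# Obscure reviewer: lift 15.0, balanced score ~41.6
print(log_adjusted_score(15, 50_000, 200, 10_000_000))
```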

Try adjusting the values below to see how different channel sizes and overlaps affect the scores:

Interactive Calculator

[Interactive widget: sliders for the seed channel size (50K), target channel size (120K), overlap (8.5K, max 50K), and total population (10M), with a live Venn diagram and the resulting scores.]

With the default values: raw overlap = 8.5K; lift = (8.5K/50K) / (120K/10M) = 14.17; log-adjusted score = 14.17 × ln(8501) ≈ 128.2.

Minimum Overlap Threshold

A quick note: in practice, we often have to apply a minimum overlap threshold (e.g., 100 users) before scoring, which filters out statistically unreliable results. For example, here's what happens if the obscure reviewer only has 5 viewers:

Lift Score: Gaming Channel vs Obscure Reviewer

[Venn diagram] Total population: 10,000,000 users. Gaming Channel (50,000 total) and Obscure Reviewer (5 total) share just 3 viewers; 49,997 watch only the gaming channel and 2 watch only the reviewer. Tiny overlap (3), but a lift score of 120x.

There is a good chance that 3 of the 5 viewers watch the medium-sized gaming channel purely by coincidence, but this gives the obscure reviewer a lift score of 120x: $\frac{3/50000}{5/10000000} = 120$. A minimum overlap threshold discards such candidates before they can pollute the rankings.
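
Here's a sketch of the threshold in code; the cutoff of 100 co-viewers is illustrative, and the right value depends on your data:

```python
import math

MIN_OVERLAP = 100  # illustrative cutoff; tune for your dataset

def thresholded_score(overlap: int, seed_size: int,
                      target_size: int, n_users: int):
    """Log-adjusted lift, but drop candidates with too few co-viewers."""
    if overlap < MIN_OVERLAP:
        return None  # too little data to trust the lift
    lift = overlap * n_users / (seed_size * target_size)
    return lift * math.log(overlap + 1)

print(thresholded_score(3, 50_000, 5, 10_000_000))            # None (filtered)
print(thresholded_score(8_500, 50_000, 120_000, 10_000_000))  # ~128.2
```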

Conclusion

Analyzing user affinity requires balancing competing signals. Raw counts favor popularity. Lift scores favor obscurity. The log-adjusted approach finds the middle ground: channels that are both meaningfully related and have enough data to be confident in the signal.