Pushing the Limits of the Two-Tower Model | by Samuel Flender | Dec, 2023

Where the assumptions behind the two-tower model break — and how to go beyond

Towards Data Science
(Image created by the author using AI)

Two-tower models are among the most common architectural choices in modern recommender systems. The key idea is to have one tower that learns relevance, and a second, shallow tower that learns observational biases such as position bias.

In this post, we’ll take a closer look at two assumptions behind two-tower models, in particular:

  • the factorization assumption, i.e. the hypothesis that we can simply multiply the probabilities computed by the two towers (or add their logits), and
  • the positional independence assumption, i.e. the hypothesis that the only variable that determines position bias is the position of the item itself, and not the context in which it is impressed.

We’ll see where both of these assumptions break, and how to go beyond these limitations with newer algorithms such as the MixEM model, the Dot Product model, and XPA.
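To make the factorization assumption concrete, here is a minimal sketch (not from the article; all numbers are made up) of the two ways of combining the towers' outputs that the list above mentions: adding logits, and multiplying probabilities.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical logits for a single impression:
relevance_logit = 1.2    # main tower: "how relevant is this item?"
position_logit = -0.8    # shallow tower: "how likely was this slot seen?"

# Additive-logit form of the factorization assumption: the click logit
# is simply the sum of the two towers' logits.
p_click_additive = sigmoid(relevance_logit + position_logit)

# Alternative multiplicative form ("multiply the probabilities"):
# P(click) = P(relevant) * P(observed).
p_click_product = sigmoid(relevance_logit) * sigmoid(position_logit)

# The two forms are not numerically identical -- they are two different
# ways of encoding the same independence assumption between the towers.
```

Either way, the position tower only modulates the click probability; at serving time it is dropped, so ranking depends on the relevance tower alone.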

Let’s start with a very brief reminder.

Two-tower models: the story so far

The primary objective for the ranking models in recommender systems is relevance: we want the model to predict the best possible piece of content given the context. Here, context simply means everything that we’ve learned about the user, for example from their previous engagement or search histories, depending on the application.

However, ranking models usually exhibit certain observation biases, that is, the tendency of users to engage more or less with an impression depending on how it was presented to them. The most prominent observation bias is position bias — the tendency of users to engage more with items that are shown first.

The key idea in two-tower models is to train two “towers”, that is, two neural networks, in parallel: the main tower for learning relevance, and…
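The joint training described above can be sketched end-to-end on toy data. This is a deliberately minimal illustration under assumptions of my own (a linear relevance tower, a per-position bias tower, additive logits), not the architectures discussed in the post:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (entirely made up): 4 context/item features, 5 display positions.
n, d, n_pos = 2000, 4, 5
X = rng.normal(size=(n, d))
pos = rng.integers(0, n_pos, size=n)

# Simulated clicks: relevance driven by features, observation bias
# decaying with position.
true_w = np.array([1.0, -2.0, 0.5, 1.5])
logits_true = X @ true_w - 0.6 * pos
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits_true))).astype(float)

# Two "towers", kept minimal: a linear relevance tower (w) and a
# per-position bias tower (b), combined by adding their logits.
w = np.zeros(d)
b = np.zeros(n_pos)
lr = 0.5

for _ in range(2000):
    logits = X @ w + b[pos]                  # additive combination
    pred = 1.0 / (1.0 + np.exp(-logits))
    err = pred - y                           # gradient of log loss w.r.t. logits
    w -= lr * (X.T @ err) / n
    b -= lr * np.bincount(pos, weights=err, minlength=n_pos) / n

# At serving time the position tower is dropped: rank by X @ w alone.
serving_scores = X @ w
```

After training, `b` should decrease with position (top slots get the largest bias), while `w` recovers the relevance signal that is actually used for ranking.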
