In short, the Deep Interest Network uses attention to learn how relevant an ad is to a user's past behaviors.
The Deep Interest Network (DIN) is a model developed by researchers at Alibaba. It was designed to address challenges in click-through rate (CTR) prediction, which is a key problem in online advertising systems. The main contributions and components of the DIN can be summarized as follows:
1. User Interest Modeling: The authors propose modeling user interest from the user's historical behaviors relative to the ad being scored. Earlier models, such as Wide & Deep, typically compress behavior features into one fixed-length representation regardless of the candidate ad, which makes it hard to capture a user's diverse interests. DIN instead uses an interest extractor layer that adaptively learns a representation of the user's interests from historical behaviors with respect to a given ad.
2. Adaptive Activation Function: The authors introduce an activation function that models how much each of the user's interests contributes to the prediction for a given ad. It is essentially an attention mechanism (the paper calls it a local activation unit) that assigns a weight to each of the user's historical behaviors according to its relevance to the candidate ad. The formula of the activation function is:
\[ a(\mathbf{v}, \mathbf{t}) = \frac{\exp(\mathbf{v}^T\mathbf{t})}{\sum_{\mathbf{v'} \in \mathbf{V}} \exp(\mathbf{v'}^T\mathbf{t})} \]
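where \(\mathbf{v}\) is the embedding of one of the user's historical behaviors, \(\mathbf{t}\) is the target ad embedding, and \(\mathbf{V}\) is the set of all of the user's behavior embeddings. This dot-product softmax is a simplified form: in the paper, the weight comes from a small feed-forward network applied to the behavior and ad embeddings, and the weights are deliberately not softmax-normalized, so that the overall intensity of the user's interest is preserved.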
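To make the formula concrete, here is a minimal sketch in plain Python that computes the weights exactly as written above (softmax over dot products). The function names and the toy 2-dimensional embeddings are made up for illustration and do not come from the paper or any library.

```python
import math

def dot(u, w):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, w))

def attention_weights(behaviors, target):
    """Softmax over dot-product scores between each behavior and the target ad.

    Follows the formula as given in the text; this is an illustrative
    sketch, not the authors' implementation.
    """
    scores = [dot(v, target) for v in behaviors]
    # Subtracting the max score keeps exp() stable and leaves the softmax unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings (hypothetical values): three past behaviors and one candidate ad.
behaviors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_ad = [2.0, 0.0]

weights = attention_weights(behaviors, target_ad)
for v, w in zip(behaviors, weights):
    print(v, round(w, 4))
```

The first and third behaviors have the same dot product with the ad, so they receive equal weight, while the second behavior, which is orthogonal to the ad, receives the smallest weight.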
3. Architecture: The DIN model consists of an Embedding & Combination layer, an Interest Extractor layer, and a Stacking layer.
The Embedding & Combination layer is responsible for transforming the categorical input features into low-dimensional dense embeddings, and combining them with numerical input features.
The Interest Extractor layer uses the activation function to weight the user behavior embeddings, then sums them to get the user's interest representation.
The Stacking layer is a standard feed-forward neural network, which takes the combined features from the first layer and the user interest representation from the second layer to make the final prediction.
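The three layers described above can be sketched as a single forward pass. This is a rough illustration under stated assumptions, not the authors' code: the embedding size, hidden width, random untrained weights, and all function names (`make_table`, `interest_vector`, `predict_ctr`) are hypothetical, and the attention weights use the softmax dot-product form given in this summary.

```python
import math
import random

random.seed(0)
EMB_DIM = 4  # hypothetical embedding size

def make_table(rows, cols):
    """Randomly initialised matrix, used for embedding tables and layer weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def interest_vector(behavior_embs, ad_emb):
    """Interest Extractor: weight each behavior by relevance to the ad, then sum."""
    weights = softmax([dot(v, ad_emb) for v in behavior_embs])
    return [sum(w * v[i] for w, v in zip(weights, behavior_embs))
            for i in range(len(ad_emb))]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, W, b, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(dot(row, x) + bias) for row, bias in zip(W, b)]

def predict_ctr(item_table, behavior_ids, ad_id, numeric_feats, params):
    # 1. Embedding & Combination: look up dense vectors for categorical ids.
    behavior_embs = [item_table[i] for i in behavior_ids]
    ad_emb = item_table[ad_id]
    # 2. Interest Extractor: ad-dependent weighted sum of behavior embeddings.
    interest = interest_vector(behavior_embs, ad_emb)
    # 3. Stacking: concatenate everything and run a small feed-forward network.
    x = interest + ad_emb + numeric_feats
    h = dense(x, params["W1"], params["b1"], relu)
    return dense(h, params["W2"], params["b2"], sigmoid)[0]

# Toy setup: 10 items, 2 numeric features, one hidden layer of 8 units.
item_table = make_table(10, EMB_DIM)
numeric_feats = [0.3, 1.0]
in_dim = 2 * EMB_DIM + len(numeric_feats)
hidden = 8
params = {
    "W1": make_table(hidden, in_dim), "b1": [0.0] * hidden,
    "W2": make_table(1, hidden), "b2": [0.0],
}

p = predict_ctr(item_table, [1, 4, 7], 2, numeric_feats, params)
print("predicted click probability (untrained):", p)
```

Because the weights are random and untrained, the printed probability carries no meaning; the point is only that the user representation is recomputed for each candidate ad, so the same behavior history yields a different interest vector for a different ad.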
4. Experimental Results: The authors report that DIN significantly outperformed the baseline models it was compared against on a large-scale dataset from Alibaba's display advertising system.
The paper's main implication for online advertising is that user interest should be represented relative to the candidate ad rather than as a single fixed vector. By using an attention mechanism to weight a user's past behaviors against each ad, DIN makes CTR prediction more personalized and accurate, which can in turn increase user engagement and revenue for online advertising platforms.