Recommendation Algorithms

TLDR

In practice, my guess is that most modern approaches at companies with a lot of users are neural-network-based hybrids. That is, they predict engagement with an item {defined over multiple engagement types, e.g. likes / shares / saves / follow creator / favorite creator / purchases / what-have-you, perhaps with auxiliary losses as well to help learning} from hybrid information: things like past user interactions {maybe a big embedding}, video information {likely a big embedding}, and device and account settings {maybe a smaller embedding}.
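To make that concrete, here is a minimal sketch of what such a model could look like: one shared hidden layer fed by the concatenated embeddings, with one sigmoid output per engagement type. Everything here is an assumption for illustration (the embedding sizes, the four engagement types, the random untrained weights, and the helper names `make_layer`, `dense` and `predict_engagement`); a real system would learn the weights and be far larger.

```python
import math
import random

# Hypothetical engagement types; a real system would pick its own set.
ENGAGEMENT_TYPES = ["like", "share", "save", "follow"]

def make_layer(rng, in_dim, out_dim):
    """Return (weights, biases): small random weights, zero biases."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def dense(weights, biases, x):
    """Plain fully connected layer: one dot product per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def predict_engagement(hidden_layer, head_layer, user_emb, item_emb, context_emb):
    """Concatenate the embeddings, apply a ReLU layer, then one sigmoid per type."""
    x = list(user_emb) + list(item_emb) + list(context_emb)
    hidden = [max(0.0, h) for h in dense(*hidden_layer, x)]
    logits = dense(*head_layer, hidden)
    return {name: 1.0 / (1.0 + math.exp(-z)) for name, z in zip(ENGAGEMENT_TYPES, logits)}

rng = random.Random(0)
user_emb = [rng.gauss(0, 1) for _ in range(8)]     # past interactions (assumed size)
item_emb = [rng.gauss(0, 1) for _ in range(8)]     # video information (assumed size)
context_emb = [rng.gauss(0, 1) for _ in range(2)]  # device / account settings
hidden_layer = make_layer(rng, 18, 16)
head_layer = make_layer(rng, 16, len(ENGAGEMENT_TYPES))

scores = predict_engagement(hidden_layer, head_layer, user_emb, item_emb, context_emb)
print(scores)
```

Each head gets its own probability, so the ranking step can combine them however the product wants {e.g. a weighted sum}.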

When it comes to neural-network-based approaches, it seems like this could be defined as just a bog-standard feed-forward prediction DNN where new data is periodically {e.g. daily} fed back into the model to fine-tune it and improve accuracy. Or you could potentially frame it as a reinforcement learning problem, too. Not sure how popular each of these approaches is.

A feed-forward net is simpler, but may have a harder time processing the sequential nature of a user's behavior {particularly depending on how the features are structured}. Like, you could weight more recent video statistics and embeddings higher, or add a time dimension to these features, but you'd have to build that into the features themselves.
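One way the "weight recent stuff higher" idea could be baked into the features is an exponentially decayed average of the embeddings of items the user recently interacted with. This is only a sketch: the function name `recency_weighted_embedding`, the half-life parameter, and the toy two-dimensional embeddings are all made up for illustration.

```python
def recency_weighted_embedding(history, now, half_life_hours=24.0):
    """Pool item embeddings into one user vector, halving an item's weight
    every `half_life_hours`.

    history: list of (timestamp_in_hours, embedding) pairs.
    """
    dim = len(history[0][1])
    pooled = [0.0] * dim
    total_weight = 0.0
    for timestamp, embedding in history:
        age = now - timestamp
        weight = 0.5 ** (age / half_life_hours)  # 1.0 for "just now", 0.5 a day ago
        total_weight += weight
        for i, value in enumerate(embedding):
            pooled[i] += weight * value
    return [value / total_weight for value in pooled]

# Toy history: one item watched a day ago, one watched just now.
history = [(0.0, [1.0, 0.0]), (24.0, [0.0, 1.0])]
profile = recency_weighted_embedding(history, now=24.0)
print(profile)  # the newer item counts twice as much as the day-old one
```

The resulting vector can be fed to a feed-forward net like any other feature, which is exactly the "build it into the features" trade-off: the model never sees the order of events, only this summary.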

Alternatively, it sounds like RL-based recommender systems have been increasingly used because they can handle sequential decision making and focus on long-term rewards and engagement.

In an RL-based recommender system, the model {or "agent"} learns to make recommendations {or "actions"} that maximize some long-term reward signal, such as cumulative user engagement over time. The agent learns from feedback {or "rewards"} it receives after making recommendations, and it updates its strategy {or "policy"} based on this feedback.
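The agent / action / reward / policy loop can be sketched with an epsilon-greedy multi-armed bandit. To be clear, this is a deliberately simplified stand-in and not full RL: it has no user state and only looks at immediate rewards, so it doesn't capture the long-term part. The class name, the simulated click probabilities, and the step count are all assumptions for illustration.

```python
import random

class EpsilonGreedyRecommender:
    """Bandit-style agent: the 'policy' is to pick the item with the best
    running-average reward, except for a small fraction of random exploration."""

    def __init__(self, n_items, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_items
        self.values = [0.0] * n_items  # running mean reward per item

    def recommend(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))  # explore
        return max(range(len(self.values)), key=self.values.__getitem__)  # exploit

    def update(self, item, reward):
        """Incrementally update the mean reward of the recommended item."""
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]

def simulate(steps=5000, seed=1):
    click_prob = [0.1, 0.5, 0.8]  # made-up user behavior, hidden from the agent
    user_rng = random.Random(seed)
    agent = EpsilonGreedyRecommender(len(click_prob))
    for _ in range(steps):
        item = agent.recommend()                                       # action
        reward = 1.0 if user_rng.random() < click_prob[item] else 0.0  # feedback
        agent.update(item, reward)                                     # policy update
    return agent

agent = simulate()
print(agent.counts, [round(v, 2) for v in agent.values])
```

A full RL formulation would add a state {e.g. the user's recent history} and optimize discounted future reward rather than the immediate click.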

RL-based recommender systems have the potential to outperform traditional methods in situations where long-term user engagement is important. For example, an RL agent could learn to recommend a diverse set of items to keep users interested in the long run, even if these items might not be the ones with the highest predicted engagement in the short term.

However, RL also comes with its own set of challenges, such as the difficulty of defining a suitable reward function, the need to balance exploration against exploitation, and the complexity of training RL models.

Background Context

Historically, it seems like recommendation algorithms have fallen into these categories:

Collaborative Filtering (CF): This approach makes recommendations based on patterns of user behavior. Collaborative filtering can be further divided into two sub-categories:
- User-based Collaborative Filtering (User-User CF): This method finds users similar to the target user based on their rating history. The idea is that if two users agree on one issue, they are likely to agree on others as well. The similarity between users can be calculated using methods like Pearson correlation or cosine similarity. Once similar users are found, the ratings they've given to other items can be used to recommend items to the target user.
- Item-based Collaborative Filtering (Item-Item CF): This method, on the other hand, calculates the similarity between items based on the ratings they've received from users. If two items are often rated similarly by users, then they are considered to be similar. Once similar items are found, each can be recommended to users who have rated the other item in the pair highly.

Content-Based Filtering (CBF): This approach makes recommendations based on the characteristics of items. For example, if a user has positively rated several action movies in the past, a content-based recommender might suggest more action movies for them to watch. This requires having some sort of descriptive profile for each item in the dataset, which could be based on manually assigned tags, machine learning algorithms that analyze text descriptions, or other methods.

Hybrid Methods: As the name suggests, hybrid methods combine collaborative and content-based filtering in various ways to make recommendations. The idea here is that by combining the strengths of both methods, you can achieve better performance than with either method alone. One simple way to create a hybrid recommender is to generate recommendations with both a CF and a CBF algorithm, then combine the results in some way, such as by taking a weighted average.
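The categories above can be sketched on a toy example: user-user CF, item-item CF, a content-based score, and a weighted-average hybrid. The ratings, the two item features, the 0.6 / 0.4 blend, and all function names are made up for illustration. Treating unrated items as 0 inside the cosine similarity is a simplification; real systems usually compare only co-rated items or mean-center the ratings first.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy ratings on a 1-5 scale: one row per user, one column per item, 0 = not rated.
ratings = {
    "ann": [5, 4, 0, 1],
    "bob": [4, 5, 5, 1],
    "cat": [1, 1, 2, 5],
}
# Hypothetical item profiles: [is_action, is_romance].
item_features = [[1, 0], [1, 0], [1, 0], [0, 1]]

def item_column(j):
    """All users' ratings for item j."""
    return [row[j] for row in ratings.values()]

def user_based_score(target, item):
    """User-user CF: similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == target or row[item] == 0:
            continue
        sim = cosine(ratings[target], row)
        num += sim * row[item]
        den += abs(sim)
    return num / den if den else 0.0

def item_based_score(target, item):
    """Item-item CF: similarity-weighted average of the target's own ratings
    on the other items, weighted by how similar each is to `item`."""
    num = den = 0.0
    for j, rating in enumerate(ratings[target]):
        if j == item or rating == 0:
            continue
        sim = cosine(item_column(item), item_column(j))
        num += sim * rating
        den += abs(sim)
    return num / den if den else 0.0

def content_based_score(target, item):
    """CBF: match the item's features against a rating-weighted taste profile,
    scaled by 5 so it is comparable to the rating scale."""
    profile = [0.0] * len(item_features[0])
    for j, rating in enumerate(ratings[target]):
        for k, feature in enumerate(item_features[j]):
            profile[k] += rating * feature
    return 5.0 * cosine(profile, item_features[item])

def hybrid_score(target, item, cf_weight=0.6):
    """Hybrid: weighted average of the user-user CF and content-based scores."""
    cf = user_based_score(target, item)
    cbf = content_based_score(target, item)
    return cf_weight * cf + (1.0 - cf_weight) * cbf

# Score item 2 (an action movie ann has not rated yet).
print(user_based_score("ann", 2), item_based_score("ann", 2),
      content_based_score("ann", 2), hybrid_score("ann", 2))
```

In this toy data all four scores come out fairly high for ann, since she resembles bob {who loved item 2} and her own ratings lean toward action.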

hills

20:41

30.06.23