After watching the Udemy online course Building Recommender Systems with Machine Learning and AI, I decided to write a text that helps beginners understand the basic ideas behind recommender systems.
A recommender system, or recommendation system, is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.
Over the last decade, companies have invested a lot of money in developing them. For example, in 2009 Netflix awarded a $1 million prize to a developer team for an algorithm that increased the accuracy of the company’s recommendation engine by 10 percent.
There are two main types of recommender systems – personalized and non-personalized.

Non-personalized recommendation systems, such as popularity-based recommenders, recommend the most popular items to all users, for instance the top 10 movies, the top-selling books, or the most frequently purchased products.
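A popularity-based recommender can be sketched in a few lines of Python. This is only an illustration: the purchase log and the function name `most_popular` are made up for the example, and a real system would read the data from a store or database.

```python
from collections import Counter

# Hypothetical purchase log: (user, item) pairs.
purchases = [
    ("jenny", "Aristoi"), ("tom", "Aristoi"), ("robert", "Aristoi"),
    ("jenny", "Fahrenheit 451"), ("john", "Fahrenheit 451"),
    ("tom", "The Time Machine"),
]

def most_popular(purchases, n=10):
    """Return the n most frequently purchased items, most popular first."""
    counts = Counter(item for _, item in purchases)
    return [item for item, _ in counts.most_common(n)]

# Every user gets the same list, regardless of personal taste.
top_items = most_popular(purchases, n=2)
```

Note that the user is not an input to the function at all, which is exactly what makes the approach non-personalized.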
What is a good recommendation?
- The one that is personalized (relevant to that user)
- The one that is diverse (includes different user interests)
- The one that doesn’t recommend the same items to users for the second time
- The one that recommends available products
Personalized
A personalized recommender system analyzes user data in more detail: purchases, ratings, and relationships with other users. In that way, every user gets customized recommendations.
The most popular types of personalized recommendation systems are content-based and collaborative filtering.
Content based
Content-based recommender systems use item or user metadata to create specific recommendations. The user’s purchase history is observed. For example, if a user has already read a book by a certain author or bought a product of a certain brand, it is assumed that the user has a preference for that author or brand and is likely to buy a similar product in the future. Assume that Jenny loves sci-fi books and her favourite writer is Walter Jon Williams. If she reads Aristoi, then her recommended book will be Angel Station, another sci-fi book written by Walter Jon Williams.
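Jenny’s example can be sketched as follows. The metadata, the attribute labels and the choice of Jaccard similarity are assumptions made for this illustration; real systems often use richer features such as text descriptions or tags.

```python
# Hypothetical item metadata; the attribute labels are illustrative only.
books = {
    "Aristoi": {"author:Walter Jon Williams", "genre:sci-fi"},
    "Angel Station": {"author:Walter Jon Williams", "genre:sci-fi"},
    "The Time Machine": {"author:H. G. Wells", "genre:sci-fi"},
    "Pride and Prejudice": {"author:Jane Austen", "genre:romance"},
}

def jaccard(a, b):
    """Share of attributes two items have in common (0 = none, 1 = all)."""
    return len(a & b) / len(a | b)

def recommend_similar(read, books, n=2):
    """Rank unread books by their best metadata overlap with any book read."""
    scores = {
        title: max(jaccard(attrs, books[r]) for r in read)
        for title, attrs in books.items()
        if title not in read
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Jenny has read Aristoi; the same author and genre rank first.
jenny_recs = recommend_similar({"Aristoi"}, books)
```

Only the metadata of the items is used here; ratings by other users play no role.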

In practice, collaborative filtering gives better results than the content-based approach. Perhaps this is because content-based results are not as diverse as those produced by collaborative filtering.
Disadvantages of content based approach:
- The so-called filter bubble phenomenon. If a user reads a book about a political ideology and is then recommended more books related to that ideology, he stays in the “bubble of his previous interests”.
- A lot of data about the user and his preferences needs to be collected to get the best recommendations.
- In practice, about 20% of items attract the attention of 70-80% of users, while 70-80% of items attract the attention of only 20% of users. A recommender’s goal is to surface those other products that users would not notice at first glance. The content-based approach does not achieve this goal as well as collaborative filtering.
Collaborative filtering
The idea of collaborative filtering is simple: the behavior of a group of users is used to make recommendations to other users. Since the recommendation is based on the preferences of other users, it is called collaborative.
There are two types of collaborative filtering: memory based and model based.
Memory based
Memory-based techniques are applied to the raw data without preprocessing. They are easy to implement, and the resulting recommendations are generally easy to explain. However, every prediction has to be computed over all the data, which slows the recommender down.
There are two types: user based and item based collaborative filtering.
- User based – “Users who are similar to you also liked…” Products are recommended to a user because they were purchased or liked by users who are similar to that user. What does it mean for users to be similar? For example, Jenny and Tom both love sci-fi books. When a new sci-fi book appears and Jenny buys it, we can recommend that book to Tom, since he also likes sci-fi books.

- Item based – “Users who liked this item also liked…” If John, Robert and Jenny all rated the sci-fi books Fahrenheit 451 and The Time Machine highly, for example with 5 stars, then when Tom buys Fahrenheit 451, The Time Machine is also recommended to him, because the system has identified the two books as similar based on user ratings.
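The item-based example can be sketched like this. The ratings are invented (including the extra user Anna and the Cookbook item), and cosine similarity over the users’ ratings is one possible choice of similarity measure.

```python
import math

# Hypothetical ratings (user -> {book: stars}), following the example above.
ratings = {
    "john":   {"Fahrenheit 451": 5, "The Time Machine": 5},
    "robert": {"Fahrenheit 451": 5, "The Time Machine": 5, "Cookbook": 1},
    "jenny":  {"Fahrenheit 451": 5, "The Time Machine": 5},
    "anna":   {"Cookbook": 5, "Fahrenheit 451": 1},
}

def item_vector(item):
    """Ratings an item received, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts (missing = 0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_similar_item(item):
    """The other item whose rating pattern is closest to `item`."""
    others = {i for r in ratings.values() for i in r} - {item}
    return max(others, key=lambda i: cosine(item_vector(item), item_vector(i)))

# Tom buys Fahrenheit 451; which book was rated most like it?
suggestion_for_tom = most_similar_item("Fahrenheit 451")
```

No metadata about the books is involved: the two books count as similar only because the same users rated them in the same way.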

How to calculate user-user and item-item similarities?
Unlike the content-based approach, where metadata about users or items is used, the memory-based collaborative filtering approach observes user behavior, e.g. whether a user liked or rated an item, or whether an item was liked or rated by a certain user.
For example, suppose the goal is to recommend a new sci-fi book to Robert.
Steps:
- Create the user-item-rating matrix.
- Create the user-user similarity matrix. Cosine similarity (alternatives: adjusted cosine similarity, Pearson similarity, Spearman rank correlation) is calculated between every pair of users. In this way a user-user matrix is obtained. When there are fewer users than items, this matrix is smaller than the initial user-item-rating matrix.
- Look up similar users. In the user-user matrix, the users most similar to Robert are identified.
- Candidate generation. Once Robert’s most similar users are found, we look at all the books these users read and the ratings they gave.
- Candidate scoring. Depending on the ratings, books are ranked from the ones that Robert’s most similar users liked the most to the ones they liked the least. The results are normalized (on a scale from 0 to 1).
- Candidate filtering. We check whether Robert has already bought any of these books. Those books should be eliminated because he has already read them.
The calculation of item-item similarity is done in an identical way and has all the same steps as user-user similarity.
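The user-based steps can be sketched end to end for Robert. The ratings are invented, and scoring a book by the sum of similarity-weighted ratings from the neighbours is an assumption of this sketch; other scoring rules are possible.

```python
import math

# A hypothetical user-item-rating matrix, stored as nested dicts.
ratings = {
    "robert": {"Dune": 5, "Neuromancer": 4},
    "jenny":  {"Dune": 5, "Neuromancer": 5, "Aristoi": 5, "Hyperion": 3},
    "tom":    {"Dune": 4, "Neuromancer": 4, "Hyperion": 4},
    "anna":   {"Cookbook": 5, "Dune": 1},
}

def cosine(a, b):
    """Cosine similarity of two sparse rating vectors (missing = 0)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(user, k=2):
    # User-user similarities, then look up the k most similar users.
    sims = {other: cosine(ratings[user], ratings[other]) for other in ratings if other != user}
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    # Candidate generation and scoring: similarity-weighted neighbour ratings.
    scores = {}
    for n in neighbours:
        for book, stars in ratings[n].items():
            scores[book] = scores.get(book, 0.0) + sims[n] * stars
    # Normalize to the 0..1 range by dividing by the highest score.
    best = max(scores.values(), default=0.0) or 1.0
    scores = {book: s / best for book, s in scores.items()}
    # Candidate filtering: drop books the user has already rated.
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [(book, s) for book, s in ranked if book not in ratings[user]]

robert_recs = recommend("robert")
```

With this data Tom and Jenny come out as Robert’s nearest neighbours, so the candidates are the books they rated that Robert has not. Applying the same code to item vectors instead of user vectors gives the item-item variant.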
Comparison of user-based and item-based approaches
The similarity between items is more stable than the similarity between users: a math book will always be a math book, but a user can change his mind, e.g. something he liked last week he might not like next week. Another advantage is that there are usually fewer products than users, so an item-item similarity matrix will be smaller than a user-user matrix. Item-based filtering is also the better approach when a new user visits the site, since a recommendation can be made as soon as the user shows interest in a single item, whereas the user-based approach is problematic in that case.
Model based
These models are built using machine learning algorithms. A model is trained first, and recommendations are then made from the model rather than from all the data, which speeds up the system. This approach also scales better. Dimensionality reduction is often used in this approach. The best-known technique of this kind is matrix factorization.
Matrix factorization
If there is feedback from the user, for example a user has watched a particular movie or read a particular book and has given a rating, it can be represented in the form of a matrix where each row represents a particular user and each column represents a particular item. Since it is almost impossible for a user to rate every item, this matrix will have many unfilled values. This is called sparsity. Matrix factorization methods are used to find a set of latent factors and determine user preferences using these factors. Latent information can be inferred by analyzing user behavior. The latent factors are also called features.
Why factorization?
The rating matrix can be approximated as a product of two smaller matrices: a user-feature matrix and an item-feature matrix.
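A tiny example may make the idea concrete. The two features and all numbers below are invented for illustration; in a real model the features are learned and usually have no such clear meaning.

```python
# Hypothetical latent features: [affinity to sci-fi, affinity to romance].
user_features = {"jenny": [1.0, 0.1], "anna": [0.2, 0.9]}
item_features = {"Aristoi": [4.8, 0.5], "Pride and Prejudice": [0.5, 5.0]}

def predicted_rating(user, item):
    """Dot product of the user's and the item's feature vectors."""
    return sum(u * i for u, i in zip(user_features[user], item_features[item]))

# Multiplying the two small matrices fills in every cell of the rating matrix,
# including the cells for which no rating was ever given.
predicted = {
    (user, item): predicted_rating(user, item)
    for user in user_features
    for item in item_features
}
```

Two users and two items with two features each do not save any space, but with millions of users and items and a few dozen features the two small matrices are far smaller than the full rating matrix.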

Matrix factorization steps:
- Initialize the user and item matrices with random values
- The predicted ratings matrix is obtained by multiplying the user matrix and the transposed item matrix
- The goal of matrix factorization is to minimize the loss function (the difference between the predicted and the actual ratings must be minimal). Each predicted rating is the dot product of a row of the user matrix and the corresponding row of the item matrix.

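In its usual form, the regularized loss is:

$$\min_{p, q} \sum_{(u,i) \in K} \left( r(u,i) - p_u \cdot q_i \right)^2 + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)$$

where K is the set of (u,i) pairs for which the rating is known, r(u,i) is the rating for item i by user u, p_u and q_i are the rows of the user and item matrices for user u and item i, and λ is a regularization term (used to avoid overfitting).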
- To minimize the loss function we can apply Stochastic Gradient Descent (SGD) or Alternating Least Squares (ALS). Both methods can be used to incrementally update the model as new ratings come in. SGD is generally easier to implement and often faster than ALS, while ALS is easier to parallelize.
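The steps above can be sketched with plain SGD. The learning rate, regularization strength, number of epochs and the toy ratings are assumptions picked for this example, and the ratings are visited in a fixed order rather than shuffled to keep the sketch short.

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Learn user matrix P and item matrix Q from (user, item, rating) triples."""
    rng = random.Random(seed)
    # Random initialization of the user and item matrices.
    P = [[rng.uniform(0, 1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Error between the actual rating and the dot-product prediction.
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error plus L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating: dot product of user row u and item row i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Hypothetical known ratings: (user index, item index, stars).
known = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
P, Q = factorize(known, n_users=3, n_items=3)

# Training error on the known ratings, and a prediction for an unrated cell.
rmse = (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in known) / len(known)) ** 0.5
unseen = predict(P, Q, 0, 2)
```

Only the known ratings take part in training, so the sparsity of the matrix is not a problem. A low training error alone does not prove that the predictions for unrated cells are good; that has to be checked on held-out ratings.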
Hybrid recommenders
They represent a combination of different recommenders. The assumption is that a combination of several different recommenders will give better results than a single algorithm.
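One simple way to combine recommenders is a weighted sum of their scores. The scores, the weights and the function name `blend` below are made up for this sketch; it also assumes that all recommenders report scores on the same 0 to 1 scale.

```python
def blend(score_lists, weights):
    """Weighted sum of per-item scores from several recommenders."""
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    # Items ranked from the highest to the lowest combined score.
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical 0..1 scores from two recommenders for the same user.
content_scores = {"Angel Station": 0.9, "The Time Machine": 0.4}
collab_scores = {"The Time Machine": 0.8, "Hyperion": 0.7}

ranking = blend([content_scores, collab_scores], weights=[0.4, 0.6])
```

An item that both recommenders agree on, here The Time Machine, rises to the top even though neither recommender ranked it highest on its own.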
Recommender systems metrics
Which metric to use depends on the business problem being solved. If we think we have built the best possible recommender and the metric looks great, but it performs badly in practice, then the recommender is not good. The algorithm that won the Netflix Prize, for example, was never fully put into production: according to Netflix, the accuracy gains did not justify the engineering effort needed. The most important thing is that the user gains confidence in the recommender system. If we recommend the top 10 products and only 2 or 3 are relevant, the user will conclude that the recommender system is bad. For this reason, the idea is not to always recommend the top 10 items, but to recommend only items that score above a certain threshold.
Metrics:
- Accuracy (MAE, RMSE)
- Measures for top-N recommenders:
- Hit rate – First find all items in a user’s history in the training data; remove one of these items (leave-one-out cross-validation); use all the other items to train the recommender and generate the top 10 recommendations. If the removed item appears in the top 10 recommendations, it is a hit. If not, it is not a hit.
- Average reciprocal hit rate (ARHR) – we get more credit for recommending a left-out item near the top of the list than near the bottom.
- Cumulative hit rate – hits on items whose rating is below a certain threshold are discarded, e.g. ratings less than 4.
- Rating hit rate – the hit rate is broken down by rating value in order to find which ratings are getting more hits. Sum the number of hits for each rating value and divide by the total number of left-out items with that rating value.
- Online A/B testing – A/B testing is the best way to do online evaluation of your recommender system.
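The offline metrics can be sketched as below. The data is invented, and the functions assume that for each user one left-out item and one top-N list are already available.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error; penalizes large errors more than MAE."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def hit_rate(top_n_by_user, left_out_by_user):
    """Share of users whose left-out item shows up in their top-N list."""
    hits = sum(left_out_by_user[u] in top_n for u, top_n in top_n_by_user.items())
    return hits / len(top_n_by_user)

def arhr(top_n_by_user, left_out_by_user):
    """Like hit rate, but a hit at rank k only counts as 1/k."""
    total = 0.0
    for user, top_n in top_n_by_user.items():
        if left_out_by_user[user] in top_n:
            total += 1.0 / (top_n.index(left_out_by_user[user]) + 1)
    return total / len(top_n_by_user)

# Hypothetical top-3 lists and the item that was left out for each user.
top_n = {"jenny": ["A", "B", "C"], "tom": ["D", "E", "F"], "robert": ["G", "H", "I"]}
left_out = {"jenny": "A", "tom": "E", "robert": "Z"}

overall_hit_rate = hit_rate(top_n, left_out)
overall_arhr = arhr(top_n, left_out)
```

Jenny’s hit at rank 1 counts fully in ARHR and Tom’s hit at rank 2 counts half, while both count the same in the plain hit rate.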
Recommender real world challenges
- Cold start problem – a new user has appeared, what to recommend?
- For example, the top 10 best-selling products
- Top 10 products on promotion
- The user can be interviewed to find out what he likes
- A new product has appeared, how can it be recognized by the recommender?
- Use content-based attributes
- Randomly add new products to user recommendations
- Promote new products
- Churn
- Since users change their behavior over time, a certain dose of randomization should be part of a recommender system in order to refresh the top-N list of recommended items
- Be careful not to offend anyone with the recommender
- Be careful not to make discrimination of any kind
- Avoid recommending items that contain vulgar words, religious and political topics or drugs
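Some of these ideas can be combined in a small wrapper around the recommender. This is only a sketch: the function and all its parameter names are hypothetical, and how many slots to give to random new items is a business decision.

```python
import random

def recommend(history, personalized, best_sellers, new_items, n=5, explore=1, seed=None):
    """Top-n list with a cold-start fallback and a few random new items.

    All parameter names here are hypothetical, chosen for this sketch.
    """
    rng = random.Random(seed)
    # Cold start: a user without history gets the best sellers instead.
    base = personalized if history else best_sellers
    recs = [item for item in base if item not in history][: n - explore]
    # Randomly mix in new products so they get exposure and the list stays fresh.
    pool = [item for item in new_items if item not in recs and item not in history]
    recs += rng.sample(pool, min(explore, len(pool)))
    return recs

best_sellers = ["B1", "B2", "B3", "B4", "B5"]
new_items = ["N1", "N2", "N3"]

# A brand new user: no history and no personalized list yet.
new_user_recs = recommend(history=set(), personalized=[], best_sellers=best_sellers,
                          new_items=new_items, seed=42)
```

The new user gets four best sellers plus one randomly chosen new product. A filter for offensive or otherwise unsuitable items could be applied to the result in the same way.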
If you have any questions regarding this topic, or want to share some impressions – drop us an email. We will be happy to discuss this on a more detailed level! 🙂