How Does Collaborative Filtering Work in Recommender Systems?

Jul 29, 2022•7 min read


Collaborative filtering recommender systems have played a significant role in the rise of web services and content platforms such as Amazon, Netflix, and YouTube in recent years. In this age of information, knowing what customers want before they know it themselves is nothing short of a superpower. As the name suggests, recommender system algorithms offer relevant content or products to consumers based on their tastes or previous choices. In this article, we will look at how one particular type of recommender system works: collaborative filtering.

Collaborative filtering in recommender systems

[Figure: Collaborative filtering vs. content-based filtering for recommender systems]

There are two main types of recommender systems: content-based filtering and collaborative filtering. Content-based filtering uses machine learning algorithms to predict and recommend items that are new to the user yet similar to ones they already like. It relies on item features to group similar items together.

Collaborative filtering relies solely on past interactions between customers and the products they’ve used to recommend new items. Item features are not needed, because the recommendations are derived from user-item interactions, which are stored in the user-item interaction matrix.

Collaborative filtering takes all users into consideration and uses the choices of people with similar tastes and preferences to suggest new, specific products to the target customer. It also helps companies and customers keep up with what’s trending.

Two types of interactions between users and products are recorded:

The first is implicit feedback: actions such as searches, clicks, order history, and playing certain content.
The second is explicit feedback given directly by users. For example, rating a movie they’ve watched on a scale of 1 to 5 stars, liking or disliking a YouTube video, ‘starring’ an album or playlist on Spotify to mark it as a favorite, and so on.
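As a minimal sketch of how both kinds of feedback might be stored, the snippet below builds an explicit-rating matrix and a binary implicit-interaction matrix from a small event log. The user names, item names, the log format, and the function name `build_matrices` are all hypothetical and chosen only for illustration; real platforms record far richer signals.

```python
import numpy as np

# Hypothetical event log: (user, item, star rating or None).
# None means the interaction was implicit (a click or play with no rating).
events = [
    ("alice", "movie_a", 5),
    ("alice", "movie_b", None),
    ("bob",   "movie_a", 4),
    ("bob",   "movie_c", 2),
    ("carol", "movie_b", None),
]

def build_matrices(events):
    """Build an explicit-rating matrix and a binary implicit-interaction matrix."""
    users = sorted({user for user, _, _ in events})
    items = sorted({item for _, item, _ in events})
    row = {user: r for r, user in enumerate(users)}
    col = {item: c for c, item in enumerate(items)}

    ratings = np.full((len(users), len(items)), np.nan)          # NaN = not rated
    interacted = np.zeros((len(users), len(items)), dtype=int)   # 1 = any interaction

    for user, item, stars in events:
        interacted[row[user], col[item]] = 1
        if stars is not None:
            ratings[row[user], col[item]] = stars
    return ratings, interacted, users, items

ratings, interacted, users, items = build_matrices(events)
```

Keeping the two matrices separate is one possible design choice: it avoids treating a click as if it were a low star rating.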

Here’s an example for a clearer idea:

It’s the weekend and you have nothing to do, so you decide to watch a new movie. You plan to make an evening of it and invite over a friend who has similar taste in movies. He brings the new Doctor Strange movie since he knows you like sci-fi action, even though you haven’t actually watched a superhero movie before. You end up loving Doctor Strange and look forward to watching the other Marvel movies to learn the whole story.

All this was possible because you trusted that your friend knows your taste and would suggest something you had a high probability of liking. Collaborative filtering algorithms work in much the same way and suggest new content and products based on the behavior of similar customers.

Why do we need recommender systems?

[Figure: Advantages of recommender systems]

Back in 2006, Netflix offered a prize for solving a problem that is simple to state and had been around for years: find the best collaborative filtering algorithm for predicting the ratings users would give to films they haven’t watched yet, based on previous ratings of other movies.

Today, e-commerce giants continue to try to solve this problem in a better way by observing users’ past behavior to predict what other things the same user will like. Why? Because knowing what to offer in advance can boost their bottom line by increasing sales and enhancing customer experience.

Recommendations also help customers discover new products and offers that they’re not explicitly looking for, thus speeding up the search process. They also allow companies to send out personalized email newsletters featuring new TV shows, movies, products, and services that are better suited to each recipient.

One of the most significant advantages of modern recommendation algorithms is their ability to take implicit feedback and suggest new content/products, thus staying up-to-date with customers’ preferences. This enables businesses to continue catering to customers even if their tastes change over time.

User-item interaction matrix

In collaborative filtering, we ignore the features of an individual item. Instead, we focus on the group of similar people who use the item and recommend other items that this group likes.

Similar users are grouped into small clusters, and each user is recommended new items according to the preferences of their cluster. Let’s understand this with an easy movie recommendation example:

[Figure: User-item interaction matrix for the movie recommendation example. Image source: GeeksforGeeks]

What we can infer from this user-item matrix is:

Users 1 and 2 both liked Movie 1. Since User 1 also liked Movies 2 and 4 a lot, there’s a high chance that User 2 will enjoy them too.
Users 1 and 3 have opposite tastes.
Users 3 and 4 both disliked Movie 2, so their tastes appear similar, and there’s a high chance User 4 will also dislike Movie 4.
User 3 might dislike Movie 1, since User 1, who has the opposite taste, liked it.

This is the logic behind employing a user-item interaction matrix: finding clusters of similar users through collaborative filtering.
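The inferences listed above can be sketched numerically. The snippet below is only an illustration: the ratings are invented (they are not the values from the figure), and Pearson correlation over co-rated movies is one common similarity measure that the article itself does not prescribe. The helper name `pearson` is hypothetical.

```python
import numpy as np

# Hypothetical ratings (rows: Users 1-4, columns: Movies 1-4); NaN = not rated.
# These numbers are invented for illustration and are not taken from the figure.
ratings = np.array([
    [5.0,    4.0,    1.0, 5.0],
    [4.0,    np.nan, 2.0, np.nan],
    [np.nan, 1.0,    5.0, 2.0],
    [3.0,    2.0,    4.0, np.nan],
])

def pearson(a, b):
    """Pearson correlation over the items both users rated; 0.0 if undefined."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if both.sum() < 2:
        return 0.0
    x = a[both] - a[both].mean()
    y = b[both] - b[both].mean()
    denom = np.sqrt((x ** 2).sum() * (y ** 2).sum())
    return float((x * y).sum() / denom) if denom > 0 else 0.0

n_users = ratings.shape[0]
# similarity[u, v] is close to +1 for similar tastes and close to -1 for opposite tastes.
similarity = np.array([[pearson(ratings[u], ratings[v]) for v in range(n_users)]
                       for u in range(n_users)])
```

With these made-up numbers, Users 1 and 2 come out strongly positively correlated, while Users 1 and 3 come out strongly negatively correlated, which mirrors the "opposite tastes" observation.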

Types of collaborative filtering

The two types of collaborative filtering approaches are:

Memory-based collaborative approach
Model-based collaborative approach

[Figure: Types of collaborative filtering. Image source: Iterators]

Memory-based collaborative approach

In memory-based collaborative filtering, only the user-item interaction matrix is utilized to make new recommendations to users. The whole process is based on the users’ previous ratings and interactions.

Memory-based filtering consists of two methods: user-based collaborative filtering and item-based collaborative filtering.

User-based collaborative filtering

To generate recommendations for a target user, a group of similar users (the nearest neighbors) is first identified based on the target user’s interactions. The items that are most popular in this group but new to the target user are then suggested.

[Figure: User-based collaborative filtering. Image source: Towards Data Science]
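A rough sketch of the user-based approach is shown below. It assumes explicit ratings with NaN for unrated items, mean-centered cosine similarity between users, and a similarity-weighted average of the neighbors' ratings; these are common choices rather than the only ones. The function name `recommend_user_based`, its parameters, and the sample ratings are hypothetical.

```python
import numpy as np

def recommend_user_based(ratings, target, k=2, n=2):
    """Rank items the target user has not rated, using the k most similar users."""
    rated = ~np.isnan(ratings)
    counts = rated.sum(axis=1, keepdims=True)
    means = np.where(rated, ratings, 0.0).sum(axis=1, keepdims=True) / np.maximum(counts, 1)
    centered = np.where(rated, ratings - means, 0.0)  # unrated items become 0 (neutral)

    # Cosine similarity between the target's centered row and every other row.
    norms = np.linalg.norm(centered, axis=1)
    sims = centered @ centered[target] / np.maximum(norms * norms[target], 1e-12)
    sims[target] = -np.inf  # a user is never their own neighbor

    # Keep the k most similar users, dropping any with non-positive similarity.
    neighbors = [u for u in np.argsort(sims)[::-1][:k] if sims[u] > 0]

    predictions = {}
    for item in np.flatnonzero(~rated[target]):
        voters = [u for u in neighbors if rated[u, item]]
        if not voters:
            continue
        weights = sims[voters]
        offset = (weights * centered[voters, item]).sum() / weights.sum()
        predictions[int(item)] = float(means[target, 0] + offset)
    return sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical ratings: user 0 is the target and has not rated items 3 and 4.
ratings = np.array([
    [5.0, 4.0, 1.0, np.nan, np.nan],
    [5.0, 5.0, 1.0, 5.0,    2.0],
    [4.0, 5.0, 2.0, 4.0,    1.0],
    [1.0, 2.0, 5.0, 1.0,    5.0],
])
recommendations = recommend_user_based(ratings, target=0)
```

Here users 1 and 2 end up as the neighbors of user 0, so item 3 (which both of them rated highly) is ranked above item 4.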

Item-based collaborative filtering

In item-based filtering, new recommendations are selected based on the past interactions of the target user. First, all the items that the user has already liked are considered. Then, for each of them, the most similar items (the nearest neighbors) are computed from the interaction matrix. New items from these neighborhoods are suggested to the user.

[Figure: Item-based collaborative filtering]
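The item-based variant can be sketched in a similar way. The snippet below assumes cosine similarity between item columns (with unrated entries treated as 0) and scores each unseen item by a similarity-weighted average of the target user's own ratings. The function name `recommend_item_based` and the sample data are hypothetical, and production systems typically add refinements such as rating normalization and neighborhood truncation.

```python
import numpy as np

def recommend_item_based(ratings, target, n=2):
    """Score each unrated item by its similarity to items the target user already rated."""
    rated = ~np.isnan(ratings)
    filled = np.where(rated, ratings, 0.0)

    # Item-item cosine similarity, computed between columns of the matrix.
    norms = np.linalg.norm(filled, axis=0)
    item_sims = (filled.T @ filled) / np.maximum(np.outer(norms, norms), 1e-12)

    seen = np.flatnonzero(rated[target])
    scores = {}
    for item in np.flatnonzero(~rated[target]):
        weights = item_sims[item, seen]
        if weights.sum() <= 0:
            continue
        # Similarity-weighted average of the target user's own ratings.
        scores[int(item)] = float(weights @ filled[target, seen] / weights.sum())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical ratings: user 0 loved item 0, disliked item 1, and has not rated items 2 and 3.
ratings = np.array([
    [5.0, 1.0, np.nan, np.nan],
    [5.0, 1.0, 5.0,    1.0],
    [4.0, 2.0, 5.0,    1.0],
    [1.0, 5.0, 1.0,    5.0],
    [2.0, 5.0, 1.0,    4.0],
])
recommendations = recommend_item_based(ratings, target=0)
```

In this toy data, item 2 is rated much like item 0 by the other users, so it is ranked above item 3, which behaves more like the item the target user disliked.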

Model-based collaborative approach

In the model-based approach, machine learning models are used to predict and rank interactions between users and the items they haven’t interacted with yet. These models are trained on the interaction information already available in the interaction matrix, using algorithms such as matrix factorization, deep learning, and clustering.

Matrix factorization

Matrix factorization generates latent features by decomposing the sparse user-item interaction matrix into two smaller, dense matrices: one representing users and one representing items.

Going back to our movie example, let’s assume we have a sparse matrix of 4 users and 4 movies, with ratings ranging from 1 to 5.

Since not all the movies are viewed and rated by every user, we end up with a sparse matrix. To create a model for our matrix, we can assume that:

There exist some latent features that can differentiate between good and bad movies.
These features can help us understand user choices (the higher the value, the stronger the preference).

We do not provide these features explicitly; instead, we let the model discover useful features and build its user and item matrices. Because the features are learned rather than provided, they carry mathematical meaning but have no intuitive interpretation.

[Figure: Matrix factorization]
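One way to learn such a factorization is stochastic gradient descent over the known ratings only, sketched below. The article does not say which optimizer or hyperparameters to use, so the learning rate, regularization strength, number of factors, epoch count, the function name `factorize`, and the 4x4 ratings are all assumptions made for illustration.

```python
import numpy as np

def factorize(ratings, n_factors=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Learn user matrix P and item matrix Q so that P @ Q.T approximates the known ratings."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    P = rng.random((n_users, n_factors))   # latent features per user
    Q = rng.random((n_items, n_factors))   # latent features per item
    known = np.argwhere(~np.isnan(ratings))  # (user, item) pairs that have a rating

    for _ in range(epochs):
        for u, i in known:
            err = ratings[u, i] - P[u] @ Q[i]
            p_old = P[u].copy()
            # Gradient step on squared error with L2 regularization.
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

# Hypothetical sparse matrix of 4 users and 4 movies (ratings 1-5, NaN = not rated).
ratings = np.array([
    [5.0,    3.0,    np.nan, 1.0],
    [4.0,    np.nan, np.nan, 1.0],
    [1.0,    1.0,    np.nan, 5.0],
    [np.nan, 1.0,    5.0,    4.0],
])
P, Q = factorize(ratings)
predicted = P @ Q.T  # dense matrix: also contains estimates for the missing entries
```

The entries of `predicted` that correspond to NaN cells in `ratings` are the model's estimates for unseen user-movie pairs, and ranking them per user yields recommendations.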

Collaborative filtering: Advantages and disadvantages

Advantages

No domain knowledge is required since all the features are learned automatically.
Can help users discover new interests even if they’re not actively searching for them by recommending new items similar to what they’re interested in.
Does not require detailed features or contextual data about products or items. It only needs the user-item interaction matrix to train the matrix factorization model.

Disadvantages

Data sparsity makes it difficult to recommend new products or to serve new users (the cold-start problem), since the suggestions are based on historical data and interactions.
As the user base grows, the data volume becomes very large and the algorithms scale poorly.
Lack of diversity in the long run. This might seem counterintuitive since the whole point of collaborative filtering is to recommend new items to the user. However, since the algorithms rely on historical ratings, they will not recommend items with little or no interaction data. Popular products become even more popular over time, and new and diverse options are under-represented.

In this article, we discussed the collaborative filtering approach to recommender systems and how it leverages the user-item interaction matrix to make suggestions. We also discussed the different types of collaborative filtering, namely the memory-based and model-based approaches. The memory-based approach can be further divided into user-based and item-based collaborative filtering, depending on whether the user or the item is treated as the reference point for making suggestions. We then looked at the pros and cons of collaborative filtering, and why it’s one of the best choices for recommendation systems.

Author
Turing Staff