Matrix Factorization Part III: Production phase

Moving from a simple paper implementation to what we need to do to be production ready.

Photo by Haneul Kim

In the previous articles we successfully trained user and item embeddings using Matrix Factorization and tested them on our training dataset. There we simply predicted scores for (user, item) pairs that already exist. However, we are building a recommender system, which means that given a user, n items must be recommended.

Today we will write code that recommends n items for a given user and also handles cold-start users. The cold-start problem refers to users and items that enter our service with little or no previous data. It is a common problem in recommender systems, and a lot of research aims to recommend good items to cold-start users as well. Some remedies include recommending the most popular items, recommending similar items, and using models that generalize well.

Now, let’s recommend some movies to users :)

For each user in the test dataset, we will recommend the top 3 most relevant movies. For cold-start users, we will recommend the 3 most popular movies.
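The routing between warm and cold-start users can be sketched roughly as below. This is only an illustration, not the code from this series: `recommend_for_users`, `known_users`, `warm_recommender`, and `popular_items` are hypothetical names, and a user is assumed to be "warm" simply when their id was seen during training.

```python
from typing import Callable, Dict, Hashable, List, Set


def recommend_for_users(
    test_users: List[Hashable],
    known_users: Set[Hashable],
    warm_recommender: Callable[[Hashable, int], List[int]],
    popular_items: List[int],
    k: int = 3,
) -> Dict[Hashable, List[int]]:
    """Return k item ids per user (hypothetical helper, for illustration only)."""
    recommendations = {}
    for user in test_users:
        if user in known_users:
            # Warm user: we have a trained embedding, so use the model.
            recommendations[user] = warm_recommender(user, k)
        else:
            # Cold-start user: fall back to the most popular items.
            recommendations[user] = popular_items[:k]
    return recommendations


# Toy stand-ins so the sketch runs on its own.
def toy_warm_recommender(user, k):
    return [user * 100 + i for i in range(k)]


known = {1, 2}                 # user ids seen in training (assumed)
popular = [50, 10, 30, 20]     # item ids, already sorted by popularity (assumed)
recs = recommend_for_users([1, 3], known, toy_warm_recommender, popular, k=3)
print(recs)
```

Here user 1 is routed to the model, while user 3 has no embedding and receives the first three popular items.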

Lines 14~16, 18, and 20 are cold-start users; the others are warm users.

When a warm user (not cold-start) enters our service, we look up the trained user embedding and pass it to get_topK() to get recommendations.
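As a rough idea of what such a top-k lookup might do, here is a minimal sketch. It is not the actual get_topK() from this series: the function name `get_top_k_sketch`, its parameters, and the toy embeddings are assumptions, and scores are assumed to be plain dot products between the user embedding and every item embedding (no bias terms).

```python
import numpy as np


def get_top_k_sketch(user_vec, item_emb, item_ids, k=3, seen_idx=None):
    """Score every item for one user and return the k best item ids.

    Hypothetical helper: assumes score = dot(user embedding, item embedding).
    """
    scores = item_emb @ user_vec              # shape: (n_items,)
    if seen_idx:
        scores = scores.copy()
        scores[list(seen_idx)] = -np.inf      # never recommend already-rated items
    order = np.argsort(-scores, kind="stable")[:k]   # highest score first
    return item_ids[order].tolist()


# Toy data: 4 items with 2-dimensional embeddings.
item_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
item_ids = np.array([10, 20, 30, 40])
user_vec = np.array([2.0, 1.0])               # looked-up embedding of a warm user

top3 = get_top_k_sketch(user_vec, item_emb, item_ids, k=3)
top3_unseen = get_top_k_sketch(user_vec, item_emb, item_ids, k=3, seen_idx=[2])
print(top3, top3_unseen)
```

Masking already-rated items is a design choice added here for illustration; whether to filter them depends on the service.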

Now let’s see what happens when a cold-start user enters our service. We’ve added self.mp_df_sorted, which is created when we fit our model on the training dataset. It is basically a dataframe that represents item popularity.

When a cold-start user enters the service, we simply look up movieId values from the table above. It is sorted in descending order by average rating and by the number of ratings each movie has received.
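A popularity table of this kind could be built with pandas along the following lines. This is a sketch under assumptions: the column names `userId`, `movieId`, and `rating`, the helper columns `avg_rating` and `num_ratings`, and the toy ratings are all illustrative, and the real table is created inside the model's fit step rather than at module level.

```python
import pandas as pd

# Toy training ratings (made-up values, for illustration only).
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3, 3],
    "movieId": [10, 20, 10, 30, 10, 20, 30],
    "rating":  [5.0, 4.0, 5.0, 4.0, 5.0, 4.0, 3.0],
})

# One row per movie: average rating and how many ratings it received.
mp_df_sorted = (
    ratings.groupby("movieId", as_index=False)
    .agg(avg_rating=("rating", "mean"), num_ratings=("rating", "count"))
    # Highest average rating first; ties broken by number of ratings.
    .sort_values(["avg_rating", "num_ratings"], ascending=False)
    .reset_index(drop=True)
)

# Cold-start recommendation: just take the first 3 movie ids.
cold_start_top3 = mp_df_sorted["movieId"].head(3).tolist()
print(mp_df_sorted)
print(cold_start_top3)
```

One thing to keep in mind with this ordering is that a movie with a single 5-star rating would rank above a well-rated movie with many ratings, so a minimum `num_ratings` threshold may be worth considering.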

When serving a recommender engine in production, here are a few things that I always consider:

Possible improvements:


Even though it is important to focus on the recommender model and its performance metrics, from the perspective of your users and your company there are much more important metrics to consider. It’s important for Data Scientists to understand these and apply them so that both your company and your users will be satisfied. What I’ve learned in the past few years is that it’s easy to get stuck building a more complex model that only slightly increases performance. You must always step back and think about your users and your company’s future.



