A recommender system for predicting the "rating" or "preference" of a user choice

AmineFouzai/Movies_Recommendation_System

Movies_Recommendation_System

Build a Simple Recommender System

A recommender system, or recommendation system (sometimes with 'system' replaced by a synonym such as 'platform' or 'engine'), is a subclass of information filtering system that seeks to predict the "rating" or "preference" a user would give to an item. Recommender systems are used primarily in commercial applications.

Data Preprocessing

import numpy as np 
import pandas as pd
ratings_data=pd.read_csv(r'C:\Users\TTos\Desktop\dataset\ml-latest-small\ml-latest-small\ratings.csv')
ratings_data.head()
   userId  movieId  rating  timestamp
0       1        1     4.0  964982703
1       1        3     4.0  964981247
2       1        6     4.0  964982224
3       1       47     5.0  964983815
4       1       50     5.0  964982931

Movies.csv

movies_names=pd.read_csv(r'C:\Users\TTos\Desktop\dataset\ml-latest-small\ml-latest-small\movies.csv')
movies_names.head()
   movieId  title                               genres
0        1  Toy Story (1995)                    Adventure|Animation|Children|Comedy|Fantasy
1        2  Jumanji (1995)                      Adventure|Children|Fantasy
2        3  Grumpier Old Men (1995)             Comedy|Romance
3        4  Waiting to Exhale (1995)            Comedy|Drama|Romance
4        5  Father of the Bride Part II (1995)  Comedy

Merging Data from Ratings.csv & Movies.csv

movies_data=pd.merge(ratings_data,movies_names,on="movieId")
movies_data.head()
   userId  movieId  rating   timestamp  title             genres
0       1        1     4.0   964982703  Toy Story (1995)  Adventure|Animation|Children|Comedy|Fantasy
1       5        1     4.0   847434962  Toy Story (1995)  Adventure|Animation|Children|Comedy|Fantasy
2       7        1     4.5  1106635946  Toy Story (1995)  Adventure|Animation|Children|Comedy|Fantasy
3      15        1     2.5  1510577970  Toy Story (1995)  Adventure|Animation|Children|Comedy|Fantasy
4      17        1     4.5  1305696483  Toy Story (1995)  Adventure|Animation|Children|Comedy|Fantasy
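
The merge above is a many-to-one join: many rating rows point at one movie row. As a minimal sketch, assuming small made-up frames in place of the real CSV files (`toy_ratings`, `toy_movies`, and their values are illustrative, not taken from the dataset), the following shows the default inner join and the `validate` argument of `pd.merge`, which raises an error if `movieId` turns out to be duplicated on the movies side.

```python
import pandas as pd

# Toy stand-ins for ratings.csv and movies.csv (illustrative values only).
toy_ratings = pd.DataFrame({
    "userId": [1, 1, 2],
    "movieId": [1, 3, 1],
    "rating": [4.0, 4.0, 3.5],
})
toy_movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Grumpier Old Men (1995)"],
})

# how="inner" is the default: a movie with no ratings (movieId 2) is dropped.
# validate="many_to_one" checks that movieId is unique in toy_movies.
toy_merged = pd.merge(toy_ratings, toy_movies, on="movieId",
                      how="inner", validate="many_to_one")
print(toy_merged)
```

If unrated movies should stay in the result, `how="right"` would keep them with missing values in the rating columns; whether that is wanted depends on what the later steps expect.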

Group by Title & Average the Ratings

movies_data.groupby('title')['rating'].mean().head()
title
'71 (2014)                                 4.0
'Hellboy': The Seeds of Creation (2004)    4.0
'Round Midnight (1986)                     3.5
'Salem's Lot (2004)                        5.0
'Til There Was You (1997)                  4.0
Name: rating, dtype: float64

Sort the Average Ratings

movies_data.groupby('title')['rating'].mean().sort_values(ascending=False).head()
title
Karlson Returns (1970)                           5.0
Winter in Prostokvashino (1984)                  5.0
My Love (2006)                                   5.0
Sorority House Massacre II (1990)                5.0
Winnie the Pooh and the Day of Concern (1972)    5.0
Name: rating, dtype: float64

Average Ratings with Number of Ratings

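# Mean rating per title, as a DataFrame with a single 'rating' column
ratings_mean_count=pd.DataFrame(movies_data.groupby('title')['rating'].mean())
# Number of ratings each title received (a Series aligns on the title index)
ratings_mean_count['rating_count']=movies_data.groupby('title')['rating'].count()
ratings_mean_count.head()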
                                         rating  rating_count
title
'71 (2014)                                  4.0             1
'Hellboy': The Seeds of Creation (2004)     4.0             1
'Round Midnight (1986)                      3.5             2
'Salem's Lot (2004)                         5.0             1
'Til There Was You (1997)                   4.0             2
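
A mean built from one or two ratings, as in the rows above, says little about a title. One common way to use the `rating_count` column is to require a minimum number of ratings before ranking by mean. The sketch below assumes a frame with the same two columns as `ratings_mean_count`; the helper name `top_rated`, the default threshold of 50, and the toy values are hypothetical choices for illustration, not part of the original notebook.

```python
import pandas as pd

def top_rated(mean_count: pd.DataFrame, min_count: int = 50, n: int = 5) -> pd.DataFrame:
    """Return the n best-rated titles that have at least min_count ratings."""
    popular = mean_count[mean_count["rating_count"] >= min_count]
    return popular.sort_values("rating", ascending=False).head(n)

# Toy frame with the same columns as ratings_mean_count (illustrative values).
toy = pd.DataFrame(
    {"rating": [5.0, 4.2, 3.9], "rating_count": [1, 120, 80]},
    index=pd.Index(["Obscure Film", "Popular Film A", "Popular Film B"], name="title"),
)

# The single-rating title is filtered out despite its perfect mean.
print(top_rated(toy, min_count=50))
```

On the real data this would be called as `top_rated(ratings_mean_count)`, with the threshold tuned to the size of the dataset.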

Correlation System

Correlation is a statistical technique that can show whether and how strongly pairs of variables are related. For example, height and weight are related; taller people tend to be heavier than shorter people. The relationship isn't perfect.
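
To make the height and weight example concrete, here is a small sketch that computes Pearson's r directly from its definition (covariance divided by the product of the standard deviations) and compares it with `Series.corr` from pandas. The height and weight values are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up heights (cm) and weights (kg), for illustration only.
height = pd.Series([150.0, 160.0, 170.0, 180.0, 190.0])
weight = pd.Series([52.0, 58.0, 69.0, 75.0, 90.0])

# Deviations of each value from its own mean.
dh = height - height.mean()
dw = weight - weight.mean()

# Pearson's r: sum of products of deviations, divided by the square root
# of the product of the summed squared deviations.
r_manual = (dh * dw).sum() / np.sqrt((dh ** 2).sum() * (dw ** 2).sum())

# pandas computes the same quantity (Pearson is the default method).
r_pandas = height.corr(weight)
print(r_manual, r_pandas)
```

Because the toy points do not lie exactly on a straight line, the result is strongly positive but below 1, which matches the remark that the relationship isn't perfect.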

Extract Rating_count and Rating

# Rating counts (as floats) and mean ratings, re-indexed 0..n-1 as plain Series
x=ratings_mean_count['rating_count'].astype(float).reset_index(drop=True)
y=ratings_mean_count['rating'].reset_index(drop=True)
Pearson's r Value: Correlation Between x and y
r = 1: perfect positive linear relationship
r > 0: positive correlation
r = 0: no linear relationship (this alone does not imply independence)
r < 0: negative correlation
r = -1: perfect negative linear relationship
r=x.corr(y)
r2=x.corr(y, method='spearman')
r3=x.corr(y, method='kendall')
from pprint import pprint
pprint(
    {
       "Pearson's r":r,
       "Spearman's rho":r2,
       "Kendall's tau":r3
    })
{"Kendall's tau": 0.037132866375530676,
 "Pearson's r": 0.12730726667013137,
 "Spearman's rho": 0.0397780088264808}

Recommender Function

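from random import randint

# True unless the coefficient is negative
corrChek=lambda r: not r < 0

def Recommander():
    # If any of the three coefficients is non-negative, return the first five titles
    if any([corrChek(r),corrChek(r2),corrChek(r3)]):
        return ratings_mean_count['rating'].head()
    # Otherwise walk the ratings backwards with a random step;
    # the step starts at 1 because a slice step of 0 raises ValueError
    return ratings_mean_count['rating'][::-randint(1,len(ratings_mean_count['rating']))]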

Get Recommended Movies

Recommander()
title
'71 (2014)                                 4.0
'Hellboy': The Seeds of Creation (2004)    4.0
'Round Midnight (1986)                     3.5
'Salem's Lot (2004)                        5.0
'Til There Was You (1997)                  4.0
Name: rating, dtype: float64
