Movie Rating Model

Vicky_Das · May 17, 2020, 11:17am

Hi All,

I'm very new to R and am working on a project to predict movie ratings. I have completed the exploratory data analysis (EDA), but I'm struggling to understand how to build a model that predicts the ratings for each user. Can somebody please explain? (The data set has users in rows and movies in columns, with the ratings as values.)

    m1  m2  m3  m4
u1   2   4  NA  NA
u2   3   5  NA   1
u3  NA   1   2   2
u4  NA  NA   3  NA

DavoWW · May 18, 2020, 10:13am

Hi @Vicky_Das,
Welcome to the RStudio Community Forum.

Here is a reproducible example showing how to get your sample data into a data frame, start calculating simple statistics, and begin thinking about modelling options:

a <- "
user m1 m2 m3 m4
u1    2  4 NA NA
u2    3  5 NA  1
u3   NA  1  2  2
u4   NA NA  3 NA
"

dat <- read.table(text=a, header=TRUE)
dat
#>   user m1 m2 m3 m4
#> 1   u1  2  4 NA NA
#> 2   u2  3  5 NA  1
#> 3   u3 NA  1  2  2
#> 4   u4 NA NA  3 NA
str(dat)
#> 'data.frame':    4 obs. of  5 variables:
#>  $ user: chr  "u1" "u2" "u3" "u4"
#>  $ m1  : int  2 3 NA NA
#>  $ m2  : int  4 5 1 NA
#>  $ m3  : int  NA NA 2 3
#>  $ m4  : int  NA 1 2 NA

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
dat %>% 
  pivot_longer(cols=c(m1:m4), names_to="movie", values_to="rating") %>% 
  group_by(user) %>% 
  summarise(m_rating = mean(rating, na.rm=TRUE))
#> # A tibble: 4 x 2
#>   user  m_rating
#>   <chr>    <dbl>
#> 1 u1        3   
#> 2 u2        3   
#> 3 u3        1.67
#> 4 u4        3

dat %>% 
  pivot_longer(cols=c(m1:m4), names_to="movie", values_to="rating") %>% 
  group_by(movie) %>% 
  summarise(m_rating = mean(rating, na.rm=TRUE))
#> # A tibble: 4 x 2
#>   movie m_rating
#>   <chr>    <dbl>
#> 1 m1        2.5 
#> 2 m2        3.33
#> 3 m3        2.5 
#> 4 m4        1.5

Created on 2020-05-18 by the reprex package (v0.3.0)
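On the modelling side, one simple starting point is a baseline that combines the per-user and per-movie means computed above: predict each missing rating as the overall mean plus a movie effect plus a user effect. The following is only a minimal sketch in base R on the four-user sample. The names `mu`, `b_movie`, `b_user`, `pred` and `filled` are made up for illustration, and the 1 to 5 rating scale is an assumption, since the thread does not state the scale.

```r
# Sample ratings from the question (users in rows, movies in columns)
ratings <- matrix(
  c( 2,  4, NA, NA,
     3,  5, NA,  1,
    NA,  1,  2,  2,
    NA, NA,  3, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("u", 1:4), paste0("m", 1:4))
)

# Overall mean of all observed ratings
mu <- mean(ratings, na.rm = TRUE)

# Movie effect: how far each movie's mean rating sits from the overall mean
b_movie <- colMeans(ratings, na.rm = TRUE) - mu

# User effect: mean of what is left after removing mu and the movie effect
resid  <- sweep(ratings - mu, 2, b_movie)
b_user <- rowMeans(resid, na.rm = TRUE)

# Baseline prediction for every user/movie pair
pred <- mu + outer(b_user, b_movie, "+")

# Clamp to the rating scale (ASSUMPTION: ratings run from 1 to 5)
pred <- pmin(pmax(pred, 1), 5)

# Keep the observed ratings and fill only the missing cells
filled <- ifelse(is.na(ratings), pred, ratings)
round(filled, 2)
```

With real data you would fit the effects on a training split and compare the predictions against held-out ratings (for example with RMSE) before moving on to anything more elaborate.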

HTH

Vicky_Das · May 18, 2020, 10:56am

Hi David,

Thank you for your message. In the meantime, I was able to convert the data frame into a matrix and then predict the movie ratings using the "UBCF" method. Let me know what you think.

library(recommenderlab)

# train1 holds a random 75% of the actual dataset
train1 <- as.matrix(train)
train <- train[-1,]  # note: train1 was already created above, so this line does not change it
train1 <- as(train1, "realRatingMatrix")
dim(train1)

# Create the model: U(ser) B(ased) C(ollaborative) F(iltering)
Rec.model <- Recommender(train1[1:3636], method = "UBCF")

# Then apply the recommendation model to the "test" dataset.
# test1 holds the remaining 25% of the actual dataset
class(test)
test1 <- as.matrix(test)
test1 <- as(test1, "realRatingMatrix")

dim(test1)

predicted.user <- predict(Rec.model, test1, type = "ratings")
View(as(predicted.user, "data.frame"))

# Next, I spot-checked the predictions for one user ID:
# user "40" had missing ("NA") ratings before, so look at the values predicted for that user
View(as(predicted.user["40"], "data.frame"))
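For anyone wondering what user-based collaborative filtering does conceptually, here is a rough base R sketch on the four-user sample from the first post. It is a simplified illustration, not recommenderlab's implementation: `cosine_sim()` and `predict_ubcf()` are hypothetical helpers, and using cosine similarity over co-rated movies with a similarity-weighted average is an assumption made for the example.

```r
# Sample ratings from the question (users in rows, movies in columns)
ratings <- matrix(
  c( 2,  4, NA, NA,
     3,  5, NA,  1,
    NA,  1,  2,  2,
    NA, NA,  3, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("u", 1:4), paste0("m", 1:4))
)

# Cosine similarity between two users, using only movies both have rated
cosine_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  if (!any(both)) return(NA_real_)
  sum(a[both] * b[both]) / sqrt(sum(a[both]^2) * sum(b[both]^2))
}

# Predict one rating as the similarity-weighted average of the ratings
# that other users gave to the same movie
predict_ubcf <- function(ratings, user, movie) {
  others <- setdiff(rownames(ratings), user)
  others <- others[!is.na(ratings[others, movie])]  # must have rated the movie
  sims <- vapply(others,
                 function(o) cosine_sim(ratings[user, ], ratings[o, ]),
                 numeric(1))
  keep <- !is.na(sims) & sims > 0                   # usable neighbours only
  if (!any(keep)) return(NA_real_)
  sum(sims[keep] * ratings[others[keep], movie]) / sum(sims[keep])
}

predict_ubcf(ratings, "u1", "m4")  # weighted average of u2's and u3's ratings of m4
predict_ubcf(ratings, "u1", "m3")  # only u3 overlaps with u1 and has rated m3
```

With so few co-rated movies the similarities are not very informative (two users who share a single movie always get a cosine similarity of 1), which is one reason to check predictions against held-out ratings rather than only inspecting them by eye.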

nirgrahamuk · May 18, 2020, 11:03am

I formatted your code by placing 3 backticks on a line between your text and your code.

system · June 8, 2020, 11:03am

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.