Lindley Coetzee

Exploring the "MovieLens" Dataset : Part 2

Introduction

This is part 2 of the "MovieLens" dataset series (if I can even call it that). You can check out part 1 here and download the dataset here. For the code, you can check out my notebook here.

In this part I am gonna attempt to answer question two:

  1. Which movies had the highest average rating over the last 10 years?
  2. Which genre (Animation, Sci-fi or Horror) had the highest average rating across all of the years in the dataset?

Acquire and prepare the data

Preparing the DataFrames was covered in part 1. Here we will look at the three different genres. Let's create three DataFrames: Animation, Sci-fi and Horror, and display the first five rows of each.

Animation
Sci-fi
Horror
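The genre split can be sketched roughly like this. This is not the exact code from my notebook, just a minimal version under a few assumptions: the prepared DataFrame from part 1 is called `movies` here (a hypothetical name), it has a `genres` column where genres are separated by `|`, and the rows below are made-up placeholder data. The helper `genre_frame` is also just an illustrative name.

```python
import pandas as pd

# Made-up stand-in for the prepared DataFrame from part 1.
# Column names and values are assumptions for illustration only.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
    "genres": [
        "Animation|Children's|Comedy",
        "Action|Horror|Sci-Fi",
        "Horror|Thriller",
        "Animation|Sci-Fi",
        "Drama",
    ],
    "year": [1995, 1979, 1996, 1999, 1994],
    "rating": [4, 5, 3, 4, 5],
})


def genre_frame(df, genre):
    """Return the rows whose pipe-separated `genres` string contains `genre`."""
    mask = df["genres"].str.contains(genre, regex=False)
    return df[mask]


animation = genre_frame(movies, "Animation")
scifi = genre_frame(movies, "Sci-Fi")
horror = genre_frame(movies, "Horror")

# First five rows of one of the genre DataFrames.
print(animation.head())
```

Note that a movie tagged with more than one genre ends up in more than one DataFrame, so the three genres are not mutually exclusive.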

Analyze data and communicate results

Now we can get the average rating per year per DataFrame (genre). The list goes back quite a few years, so I'm only gonna display the last ten years' ratings for each genre.

Animation movies : average rating per year
Sci-fi movies : average rating per year
Horror movies : average rating per year
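For anyone who wants to see how the yearly averages can be computed, here is a small sketch. It assumes each genre DataFrame has a `year` and a `rating` column (my assumption for this example), and the numbers are made up. The function name `average_rating_per_year` is only illustrative.

```python
import pandas as pd

# Made-up ratings for a single genre; column names are assumed.
animation = pd.DataFrame({
    "year": [1998, 1998, 1999, 1999, 2000],
    "rating": [4, 5, 3, 4, 2],
})


def average_rating_per_year(df, last_n=10):
    """Mean rating per year, keeping only the most recent `last_n` years."""
    per_year = df.groupby("year")["rating"].mean().sort_index()
    return per_year.tail(last_n)


animation_avg = average_rating_per_year(animation)
print(animation_avg)
```

The same function can be called on the Sci-fi and Horror DataFrames. `tail(last_n)` keeps the last ten years that actually appear in the data, which is not necessarily ten consecutive calendar years if some years have no ratings.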

Lastly, we can plot the ratings using a line chart and hopefully we can see which genre has the highest average rating over the years.
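A line chart like this can be drawn with matplotlib. The sketch below assumes the three yearly averages have been combined into one DataFrame with one column per genre and the year as the index. The values are made-up placeholders, not the real MovieLens results, and `plot_genre_averages` is just a name I picked for the example.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up yearly averages, one column per genre (placeholder values).
averages = pd.DataFrame(
    {
        "Animation": [3.9, 3.8, 3.7],
        "Sci-fi": [3.5, 3.4, 3.6],
        "Horror": [3.0, 3.2, 2.9],
    },
    index=pd.Index([1998, 1999, 2000], name="year"),
)


def plot_genre_averages(df):
    """Draw one line per genre column, with the year on the x-axis."""
    fig, ax = plt.subplots(figsize=(10, 5))
    for genre in df.columns:
        ax.plot(df.index, df[genre], label=genre)
    ax.set_xlabel("Year")
    ax.set_ylabel("Average rating")
    ax.set_title("Average rating per year by genre")
    ax.legend()
    return ax


ax = plot_genre_averages(averages)
```

In a notebook the figure shows up on its own; in a plain script you would call `plt.show()` afterwards.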

From the above chart, it would appear that the Animation genre has the highest average rating. Horror movies got the lowest average ratings, especially from about 1980 onwards: since then there were only two years (around 1993 and around 1999) in which horror was not the worst rated genre.

I prefer sci-fi to animation or horror movies. Most people, as the data suggests, enjoy animation movies more. It could also be that they merely rated animation movies higher than sci-fi or horror movies without necessarily enjoying them more, but I don't think that is the case.