Movies have become an integral part of everyday entertainment. Entertainment giants like Netflix and HBO invest billions of dollars a year to produce high-quality movies with the most popular actors and directors in the film industry for their audiences.
The goal of the project is to analyze the factors that contribute to the success of a movie over time. We intend to look into aspects including, but not limited to, the cast, the budget, the writers and directors, the time of first screening, the genre and the IMDB rating. The analysis can give us insight into how people's interest in movies has evolved over time. We will use the datasets provided by IMDB and scrape imdb.com for any additional information we need.
- What factors contribute to making a movie an Oscar winner? Is it the actors, the director, the genre, or something else?
- What makes a movie a blockbuster? Are high budget movies always successful?
- How has people's interest in movies changed over time?
- What is the impact of these changes in interest on the success of certain types of movies?
- What is the impact of gender on the success of a movie?
- How does the geographical reach of a movie impact its success?
We intend to use datasets provided by IMDB: https://datasets.imdbws.com/.
We will also scrape pages from https://www.imdb.com/ that refer to specific movies and actors to obtain information beyond what is available in the datasets.
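As a first look at the data, the downloadable files can be read as gzip-compressed, tab-separated text in which missing values are written as the literal string `\N`. The sketch below shows one way we might parse them using only the standard library. The helper names `read_imdb_tsv` and `read_imdb_gz` are our own, the sample rows are made up so the example runs without a download, and the column subset shown is an assumption about what we will need from `title.basics.tsv.gz`.

```python
import csv
import gzip
import io

IMDB_NULL = r"\N"  # the IMDB dumps use the literal string \N for missing values


def read_imdb_tsv(handle):
    """Parse an IMDB-style TSV stream into a list of dicts, mapping \\N to None."""
    # QUOTE_NONE: treat quote characters inside titles as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {key: (None if value == IMDB_NULL else value) for key, value in row.items()}
        for row in reader
    ]


def read_imdb_gz(path):
    """Open a downloaded dump such as title.basics.tsv.gz and parse it."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return read_imdb_tsv(handle)


# Tiny in-memory sample (made-up rows) so the sketch runs without a download.
sample = io.StringIO(
    "tconst\ttitleType\tprimaryTitle\tstartYear\tgenres\n"
    "tt0000001\tmovie\tExample One\t1999\tDrama,Romance\n"
    "tt0000002\tmovie\tExample Two\t\\N\t\\N\n"
)
rows = read_imdb_tsv(sample)

# Keep only feature films; the dumps also contain series, episodes, shorts, etc.
movies = [row for row in rows if row["titleType"] == "movie"]
```

The full dumps are large, so in practice we would probably filter rows while reading rather than materialize every file as a list.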
- Inspect the datasets and determine the missing information that needs to be web scraped and the unnecessary information that needs to be ignored.
- Write code to scrape the required data from the web and store it locally
- Prepare the final datasets by merging the scraped information with the existing datasets
- Perform exploratory data analysis to get a general picture of the data and to identify any possible data issues
- Clean the data and prepare the dataset for analysis
- Perform data analysis focussing on the mentioned research questions
- Draw relevant conclusions based on the analysis findings
- Clean up the notebook and ensure all observations and conclusions have clear descriptions with proper visualizations
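The merging and cleaning steps above could look roughly like the following sketch. It joins the records on `tconst`, the title identifier used throughout the IMDB datasets, and then converts numeric fields and drops rows without a rating. The function names, the scraped field `budget_usd` and all sample values are hypothetical; which fields we actually scrape will be decided after inspecting the datasets.

```python
def merge_on_tconst(basics, ratings, scraped):
    """Left-join ratings and scraped fields onto the basics rows by tconst."""
    ratings_by_id = {row["tconst"]: row for row in ratings}
    scraped_by_id = {row["tconst"]: row for row in scraped}
    merged = []
    for row in basics:
        combined = dict(row)
        combined.update(ratings_by_id.get(row["tconst"], {}))
        combined.update(scraped_by_id.get(row["tconst"], {}))
        merged.append(combined)
    return merged


def to_number(value, cast=float):
    """Convert a string to a number, returning None for missing or bad values."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def clean(rows):
    """Convert numeric columns and drop rows that have no usable rating."""
    cleaned = []
    for row in rows:
        rating = to_number(row.get("averageRating"))
        if rating is None:
            continue  # without a rating the row cannot inform rating-based questions
        cleaned.append({
            **row,
            "averageRating": rating,
            "startYear": to_number(row.get("startYear"), int),
            "budget_usd": to_number(row.get("budget_usd")),  # hypothetical scraped field
        })
    return cleaned


# Made-up sample records standing in for the parsed datasets and scraped pages.
basics = [
    {"tconst": "tt0000001", "primaryTitle": "Example One", "startYear": "1999"},
    {"tconst": "tt0000002", "primaryTitle": "Example Two", "startYear": None},
]
ratings = [{"tconst": "tt0000001", "averageRating": "7.5", "numVotes": "1200"}]
scraped = [
    {"tconst": "tt0000001", "budget_usd": "1000000"},
    {"tconst": "tt0000002", "budget_usd": None},
]

merged = merge_on_tconst(basics, ratings, scraped)
cleaned = clean(merged)
```

Dropping unrated movies is only one possible policy. For questions that do not depend on the rating, we may prefer to keep those rows and handle the missing value during analysis. In the notebook itself, a dataframe library would likely replace these hand-written joins.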
Nothing for the moment