GitHub - xrrays/steam-sentiment-analysis-model

🎮 Steam Review Sentiment Analysis with Metacritic Comparison | Machine Learning

This notebook performs sentiment analysis on Steam game reviews using three machine learning models.

It uses two datasets:

Steam Reviews Dataset: A dataset containing raw reviews with language, game title, and a "recommended" label.
- https://www.kaggle.com/datasets/najzeko/steam-reviews-2021/data
Metacritic Dataset: A dataset with critic and user scores per game.
- https://www.kaggle.com/datasets/thedevastator/video-game-ratings-and-reviews-dataset

Load data from the datasets using config.json
Clean and pre-process reviews (balance and filter reviews for model evaluation)
Vectorize text using TF-IDF
Train/test split + model training
Evaluate models (accuracy, precision, recall, F1, confusion matrix)
Compare model-predicted sentiment vs actual Metacritic scores (correlation + graphs)
Bonus insights:
- Most common positive/negative words
- Random review samples (with model correctness for error analysis)
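
The vectorize, split, train, and evaluate steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it assumes scikit-learn, uses logistic regression as a stand-in for one of the three models (which are not named here), and runs on a handful of made-up reviews. The function name train_and_evaluate and all parameter values are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)
from sklearn.model_selection import train_test_split

# Made-up stand-in data; the notebook reads real reviews from the Steam dataset.
reviews = [
    "great game, loved every minute",
    "fantastic story and fun combat",
    "amazing visuals, great soundtrack",
    "fun with friends, highly recommend",
    "loved the art, great value",
    "fantastic and fun, amazing game",
    "terrible port, constant crashes",
    "boring, broken and buggy",
    "awful controls, refund requested",
    "buggy mess, terrible performance",
    "boring story, awful ending",
    "broken at launch, constant bugs",
]
recommended = [1] * 6 + [0] * 6  # 1 = recommended, 0 = not recommended


def train_and_evaluate(texts, labels, test_size=0.25, seed=42):
    """Split the data, vectorize with TF-IDF, fit a classifier, return metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, stratify=labels, random_state=seed
    )

    # Fit the vocabulary on the training split only, then reuse it for the test split.
    vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)

    # Assumed model choice: logistic regression stands in for the notebook's models.
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train_vec, y_train)
    y_pred = model.predict(X_test_vec)

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_test, y_pred, labels=[0, 1]),
    }


metrics = train_and_evaluate(reviews, recommended)
print(metrics)
```

The sketch splits before vectorizing so that the TF-IDF vocabulary is learned from training reviews only; the notebook may order these steps differently.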

No setup is needed beyond having config.json and the datasets in the correct file paths.
Simply run each cell in order.
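
As a quick check before running the cells, something like the sketch below can confirm that config.json points at files that exist. The key names steam_reviews_path and metacritic_path and the helper load_config are assumptions made for illustration; use whichever keys the notebook actually reads.

```python
import json
from pathlib import Path

# Hypothetical key names; replace them with the keys used in your config.json.
REQUIRED_KEYS = ("steam_reviews_path", "metacritic_path")


def load_config(config_path="config.json"):
    """Read the config file and check that every dataset path points to a file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))

    missing_keys = [key for key in REQUIRED_KEYS if key not in config]
    if missing_keys:
        raise KeyError(f"config file is missing keys: {missing_keys}")

    missing_files = [
        config[key] for key in REQUIRED_KEYS if not Path(config[key]).is_file()
    ]
    if missing_files:
        raise FileNotFoundError(f"dataset files not found: {missing_files}")

    return config
```

Calling load_config() in the first cell would raise a clear error when a path is wrong, instead of failing later during data loading.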
