The .csv files contain the Twitter US Airline Sentiment dataset (also available on Kaggle): tweets about six US airlines, each labeled as positive, neutral, or negative. The tweets were cleaned, then vectorized with a TF-IDF vectorizer and used to train 3 classifiers:
- Logistic Regression
- SVC
- Multinomial Naive Bayes
Logistic Regression outperformed the rest with a score of 0.7929. Some hyperparameter tuning could probably improve on this considerably.
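
For reference, the clean / TF-IDF / classify workflow described above can be sketched with scikit-learn roughly as follows. This is a minimal sketch and not the exact code used here: the `clean_tweet` rules, the `build_models` and `fit_and_score` helpers, the default model settings, and the toy tweets are all assumptions made for illustration, and the toy scores say nothing about the 0.7929 result.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def clean_tweet(text):
    """Hypothetical cleaning: lowercase, drop URLs and @mentions, keep letters only."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"@\w+", " ", text)           # remove @mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # remove digits and punctuation
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace


def build_models():
    """One TF-IDF + classifier pipeline per model, with mostly default settings."""
    return {
        "Logistic Regression": make_pipeline(
            TfidfVectorizer(preprocessor=clean_tweet),
            LogisticRegression(max_iter=1000),
        ),
        "SVC": make_pipeline(TfidfVectorizer(preprocessor=clean_tweet), SVC()),
        "Multinomial Naive Bayes": make_pipeline(
            TfidfVectorizer(preprocessor=clean_tweet), MultinomialNB()
        ),
    }


def fit_and_score(train_texts, train_labels, test_texts, test_labels):
    """Fit every pipeline and return its accuracy on the held-out tweets."""
    scores = {}
    for name, model in build_models().items():
        model.fit(train_texts, train_labels)
        scores[name] = model.score(test_texts, test_labels)
    return scores


# Invented toy tweets, only to show the call shape (not taken from the dataset).
train_texts = [
    "@united thanks for the great flight!",
    "@AmericanAir my flight was cancelled, terrible service",
    "@SouthwestAir what time does boarding start?",
    "@VirginAmerica love the crew, amazing trip",
    "@USAirways lost my bag again, awful",
    "@united is there wifi on this plane?",
]
train_labels = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
test_texts = ["@united awful delay and terrible service", "@AmericanAir great crew, thanks!"]
test_labels = ["negative", "positive"]

toy_scores = fit_and_score(train_texts, train_labels, test_texts, test_labels)
for name, accuracy in toy_scores.items():
    print(f"{name}: {accuracy:.4f}")
```

For the hyperparameter tuning mentioned above, one option would be to wrap a pipeline in scikit-learn's `GridSearchCV`; with `make_pipeline`, parameters are addressed by lowercase step name, for example `logisticregression__C` or `tfidfvectorizer__ngram_range`.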