Twitter Sentiment Analysis to classify tweets as sexist/racist or not sexist/racist based on the sentiment associated with them.
Twitter is a social networking and news website where users exchange short messages known as tweets. The opinions expressed in tweets can be analysed as sentiment data. Sentiment analysis is a technique that automatically identifies sentiments in social media interactions. In this project, our goal is to perform Twitter Sentiment Analysis to classify tweets as sexist/racist or not sexist/racist based on the sentiment associated with them. The datasets used in this project consist of tweets and labels, where label ‘1’ means the tweet is sexist/racist and label ‘0’ means the tweet is not sexist/racist. In other words, our task is to predict whether the label of each tweet in the datasets is ‘0’ or ‘1’.
Sentiment analysis uses Natural Language Processing (NLP) and machine learning to interpret the human language in tweets and produce results automatically. This project applies supervised machine learning: we use the labelled data in the training set to fit the model, which then classifies the tweets.
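Before a model can be fitted on labelled tweets, the text has to be turned into numbers. The sketch below shows one common way to do this in plain Python: tokenize each tweet, map words to integer ids, and pad every tweet to the same length. It is only an illustration under assumptions: the function names, the `max_len` value, the id scheme (0 for padding, 1 for unknown words), and the two placeholder tweets are hypothetical and are not taken from the project's code or data.

```python
import re
from collections import Counter

def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @handles, and split it into word tokens."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", tweet)

def build_vocab(tweets, max_words=10000):
    """Map the most frequent words to integer ids; 0 is padding, 1 is unknown."""
    counts = Counter(tok for t in tweets for tok in tokenize(t))
    return {word: idx + 2 for idx, (word, _) in enumerate(counts.most_common(max_words))}

def encode(tweet, vocab, max_len=8):
    """Convert a tweet to a fixed-length list of ids, right-padded with 0."""
    ids = [vocab.get(tok, 1) for tok in tokenize(tweet)][:max_len]
    return ids + [0] * (max_len - len(ids))

# Hypothetical labelled examples in (tweet, label) form; both use label 0
# (not sexist/racist) so that no offensive text has to be invented here.
train = [
    ("@user what a lovely day", 0),
    ("so happy for you #blessed", 0),
]

vocab = build_vocab([tweet for tweet, _ in train])
features = [encode(tweet, vocab) for tweet, _ in train]
labels = [label for _, label in train]
```

The resulting `features` (lists of integer ids) and `labels` (0 or 1) are the kind of input that an embedding layer followed by a sequence model can be trained on.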
The model we used combines an embedding layer with a Long Short-Term Memory (LSTM) network. LSTM networks are a variant of recurrent neural networks (RNNs) that can learn order dependence in sequence prediction problems. We expected the sentiment analysis model to achieve an accuracy of more than 85%. By the end of the project, we reached 99% accuracy in predicting the sentiment of the tweets.
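To make "order dependence" concrete, here is a minimal sketch of a single LSTM cell with a scalar state, written in plain Python. It is not the project's model: the weights are arbitrary illustrative values rather than trained parameters, the names `lstm_step` and `run_lstm` are hypothetical, and a real model would use a deep learning library with vector-valued states and an embedding layer in front.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, weights):
    """One time step of a scalar LSTM cell.

    weights maps each gate name to a tuple (w_x, w_h, bias).
    """
    def pre_activation(gate):
        w_x, w_h, bias = weights[gate]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))           # how much old memory to keep
    input_gate = sigmoid(pre_activation("input"))        # how much new information to store
    output = sigmoid(pre_activation("output"))           # how much memory to expose
    candidate = math.tanh(pre_activation("candidate"))   # proposed new memory content

    c = forget * c_prev + input_gate * candidate  # updated cell state
    h = output * math.tanh(c)                     # updated hidden state
    return h, c

def run_lstm(sequence, weights):
    """Feed a sequence through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, weights)
    return h

# Arbitrary illustrative weights (assumption, not learned values).
weights = {gate: (0.5, 0.5, 0.0) for gate in ("forget", "input", "output", "candidate")}

forward = run_lstm([1.0, -1.0], weights)
backward = run_lstm([-1.0, 1.0], weights)
```

The two sequences contain the same values in a different order, yet `forward` and `backward` differ, because each step depends on the state carried over from the previous one. This is the property that lets an LSTM take word order in a tweet into account.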
