International Journal of All Research Education & Scientific Methods

An ISO Certified Peer-Reviewed Journal

ISSN: 2455-6211

Text Classification on Twitter Data

Author Name : R. Swetha, Saleema, K. Dharani, M. S. Abu Tahir, U. Sai Surekha

ABSTRACT

Sentiment analysis is a classification problem in which the main goal is to predict the polarity of words and then classify text as expressing positive or negative sentiment. The classifiers used are mainly of two types: lexicon-based and machine-learning-based. The former include SentiWordNet and Word Sense Disambiguation, while the latter include Multinomial Naive Bayes (MNB), Logistic Regression (LR), Support Vector Machine (SVM) and an RNN classifier. In this paper, two existing datasets are used: the first is “Sentiment140” from Stanford University, consisting of 1.6 million tweets, and the second originally came from “CrowdFlower’s Data for Everyone library”, consisting of 13870 entries. Both datasets are already labelled according to the sentiment expressed in them. TextBlob, SentiWordNet, MNB, LR, SVM and the RNN classifier are applied to these datasets, and the results obtained from these sentiment classifiers are compared, with tweets classified according to the sentiment expressed in them, i.e., positive or negative. In addition to the individual machine learning approaches, an ensemble of MNB, LR and SVM is applied to the datasets and compared with the above results. The trained models can further be used to predict the sentiment of new data.

Keywords: Twitter; sentiment analyzer; machine learning; WordNet; word sense disambiguation (WSD); Naïve Bayes.
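
The abstract mentions an ensemble of MNB, LR and SVM but does not say how the three classifiers are combined. The following is a minimal sketch that assumes hard majority voting over the labels each model predicts; the function name `majority_vote`, the tie-breaking rule, and the example predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard voting.

    predictions_per_model: list of equal-length label lists, one per model.
    Ties are broken in favour of the earliest model in the list
    (an assumption made for this sketch).
    """
    if not predictions_per_model:
        raise ValueError("need at least one model's predictions")
    n = len(predictions_per_model[0])
    if any(len(p) != n for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical outputs of three already-trained classifiers on four tweets.
mnb_pred = ["positive", "negative", "negative", "positive"]
lr_pred = ["positive", "positive", "negative", "negative"]
svm_pred = ["negative", "positive", "negative", "positive"]

ensemble_pred = majority_vote([mnb_pred, lr_pred, svm_pred])
print(ensemble_pred)
```

With an odd number of models and two classes, as here, a tie cannot occur, so the tie-breaking rule only matters if the set of models or labels is changed.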