Sentiment Analysis And Related Issues For Twitter

Posted on:2015-10-30

Degree:Master

Type:Thesis

Country:China

Candidate:J Zhu

Full Text:PDF

GTID:2298330452450764

Subject:Computer application technology

Abstract/Summary:

In recent years, social networks have grown rapidly alongside the Internet and the mobile Internet. Users now actively generate text content online, which means they are no longer a passive audience but participants in the Internet. The mobility of microblogs, together with the ease of sharing, the simplicity, and the real-time nature of their content, has made microblogging an indispensable form of social networking in the daily life of Internet users, and most users have the space and freedom to express their opinions. These expressions or comments may be a simple message from an ordinary user, a purchase intention from an online consumer, a film review from a movie fan, or an opinion on policies and regulations published by government bodies. How to extract valuable content from this vast amount of unstructured short text has become a problem that needs to be solved.

The popularity of social networks has given rise to a new research field: microblog sentiment analysis. Microblog sentiment analysis inherits the characteristics of text sentiment analysis: it analyzes the sentiment orientation of the emotional expressions in a microblog post and assigns the post to a positive or negative class, or to a positive, negative, or neutral class. Researchers can then see clearly whether the attitude expressed by the text is supportive or opposed, and make decisions accordingly.

In this thesis, we mainly study how traditional text classification methods can be applied to the sentiment classification of microblogs, and we use machine learning methods to implement Twitter sentiment classification. We analyze the key technical problems in Twitter sentiment analysis and focus on processes and methods for increasing classification accuracy. We also analyze how different methods of feature extraction, feature weight calculation, text representation, and classifier construction affect the accuracy of Twitter sentiment classification.

We use tweets as the dataset and preprocess them with the part-of-speech tagger developed by the Stanford Natural Language Processing Group. After preprocessing, we apply three feature extraction methods, namely document frequency (DF), information gain (IG), and the chi-square statistic (CHI), to extract features from the dataset, and we compute the feature weights with Boolean weighting, term frequency (TF), and TFIDF (Term Frequency Inverse Document Frequency), respectively. Finally, two supervised classifiers, a Naive Bayes classifier and a decision tree classifier, are used to classify the sentiment of the text. We ran many experiments with different numbers of features, feature weightings, and classification algorithms to train the classifiers, and then evaluated the classifiers on test data. The experimental results indicate that the combination of Naive Bayes, CHI, and TFIDF performs best among the experiments in this thesis.
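The best-performing combination named above (chi-square feature selection, TFIDF weighting, Naive Bayes) can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy tweets, the function names, the smoothed IDF formula, and the use of TFIDF weights as fractional counts in a multinomial Naive Bayes model are all assumptions made for illustration, and the part-of-speech preprocessing step is omitted.

```python
import math
from collections import Counter, defaultdict

def chi_square(term, docs, labels, cls):
    """Chi-square statistic between the presence of `term` and class `cls`."""
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        present = term in tokens
        if present and y == cls:
            a += 1          # term present, document in class
        elif present:
            b += 1          # term present, document not in class
        elif y == cls:
            c += 1          # term absent, document in class
        else:
            d += 1          # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score over all classes."""
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    score = {t: max(chi_square(t, docs, labels, c) for c in classes) for t in vocab}
    return sorted(vocab, key=lambda t: (-score[t], t))[:k]

def fit_idf(docs, features):
    """Smoothed inverse document frequency (an assumed variant)."""
    n = len(docs)
    return {t: math.log((1 + n) / (1 + sum(t in tokens for tokens in docs))) + 1.0
            for t in features}

def tfidf(tokens, idf):
    """Sparse TFIDF vector restricted to the selected features."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}

def train_nb(vectors, labels, features):
    """Multinomial Naive Bayes with Laplace smoothing on weighted counts."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        total = defaultdict(float)
        for vec, y in zip(vectors, labels):
            if y == c:
                for t, w in vec.items():
                    total[t] += w
        denom = sum(total.values()) + len(features)
        loglik[c] = {t: math.log((total[t] + 1.0) / denom) for t in features}
    return prior, loglik

def predict(vector, model):
    """Return the class with the highest log posterior."""
    prior, loglik = model
    return max(prior, key=lambda c: prior[c]
               + sum(w * loglik[c][t] for t, w in vector.items()))

# Toy, made-up training tweets (already tokenized); not data from the thesis.
train_docs = [
    ["love", "this", "phone"],
    ["great", "movie", "love", "it"],
    ["happy", "with", "this", "great", "service"],
    ["hate", "this", "phone"],
    ["terrible", "movie", "hate", "it"],
    ["angry", "with", "this", "terrible", "service"],
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

features = select_features(train_docs, train_labels, k=4)
idf = fit_idf(train_docs, features)
model = train_nb([tfidf(d, idf) for d in train_docs], train_labels, features)

print(features)
print(predict(tfidf(["i", "love", "this", "great", "day"], idf), model))
```

Swapping `select_features` for a document-frequency or information-gain ranking, or replacing `tfidf` with Boolean or raw term-frequency weights, would give the other combinations the abstract compares; the decision tree classifier is not sketched here.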

Keywords/Search Tags:

Sentiment analysis, Text classification, Feature extraction, Feature weight, Supervised learning

Related items

1	A Study Of Text Classification Algorithms Based On Feature Selection
2	Research On Text Sentiment Classification
3	Research On High Performance Chinese Text Classification Based On Machine Learning
4	Research On Feature Generation Methods For Text Sentiment Classification
5	Research On Feature Extraction And Classification Algorithm In Text Categorization
6	Research On Classification Of Massive Text Feature Under Distributed Architecture
7	Feature Weight Optimization For Short-Text Multiclass Classification
8	Research On Text Sentiment Classification Based On Deep Learning
9	Research Of Product Feature Extraction And Sentiment Analysis Base On Chinese Online Reviews
10	Research On Chinese Short-text Sentiment Analysis