
Research Of Sentiment Analysis On English Blog Text

Posted on: 2012-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Wang
Full Text: PDF
GTID: 2178330338457708
Subject: Computer application technology
Abstract/Summary:
The Internet has entered an era dominated by Web 2.0, and blogs, as a typical representative of Web 2.0, are now widely used all over the world. With the rapid development and popularization of Web 2.0, user participation and interactivity on the network have increased greatly: people have moved from passively accepting information to actively publishing their views and opinions on the Internet. Because users take part in generating information, more and more opinionated personal content appears on blogs, BBS and other online media. Such online expression is meaningful and valuable for many applications, such as e-commerce, public opinion analysis, and information retrieval. Automatic sentiment analysis of opinionated web content has recently become a hot topic in web information processing research, and its core technology is text sentiment analysis.

This thesis studies sentiment analysis of English blog text by applying traditional text classification techniques to text sentiment analysis. The main contributions of this dissertation are summarized as follows:

(1) A framework for text sentiment analysis is designed. Preprocessing is extended with stop-word removal, a technique usually used only in information retrieval, while the stemming step used in traditional text sentiment analysis is omitted. Experiments show the superiority of this approach.
(2) Feature selection is studied. Unlike traditional text classification, which uses bigrams or N-grams (N>2) as features, this thesis uses the single word as the basic unit of sentiment features, thereby reducing computational complexity.
(3) Several typical machine learning algorithms are introduced and their advantages and disadvantages are analyzed; the Support Vector Machine (SVM) is selected to construct the classifier.
(4) Based on the proposed method and model, a sentiment analysis system for English text is implemented.
(5) The existing evaluation metrics for text classification are inherited and extended to establish a dedicated evaluation system for text sentiment analysis.

Finally, the experimental design is presented: three feature selection methods (information gain, mutual information, and the chi-square statistic) are applied to the training corpus, and a support vector machine is used to construct the classifier. The newly established evaluation system is then used to compare and analyze the experimental results. The results indicate that machine learning methods can be used for text sentiment classification and that information gain feature selection achieves the best performance. Compared with other existing methods, the approach in this thesis obtains better results, with the highest precision reaching 83.7%.
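The preprocessing and feature selection steps described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical six-document toy corpus and a tiny stop-word list, neither of which comes from the thesis. It removes stop words, applies no stemming, treats single words as features, and scores each word with two of the three named criteria (chi-square and information gain). Mutual information and SVM training are left out for brevity, and the helper names (`tokenize`, `contingency`, `rank_terms`) are illustrative only.

```python
import math
import re

# Hypothetical toy corpus (NOT the thesis data): (text, label) pairs.
DOCS = [
    ("I love this phone, it is great", "pos"),
    ("great camera and I love the screen", "pos"),
    ("what a great day", "pos"),
    ("I hate this phone, it is awful", "neg"),
    ("awful battery and I hate the screen", "neg"),
    ("what an awful day", "neg"),
]

# Small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"i", "this", "it", "is", "and", "the", "what", "a", "an"}


def tokenize(text):
    """Lowercase, split into single words, drop stop words; no stemming."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def contingency(term, docs, positive="pos"):
    """Document counts (a, b, c, d):
    a = term present, positive    b = term present, not positive
    c = term absent, positive     d = term absent, not positive
    """
    a = b = c = d = 0
    for text, label in docs:
        present = term in tokenize(text)
        if present and label == positive:
            a += 1
        elif present:
            b += 1
        elif label == positive:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a, b, c, d):
    """Chi-square statistic of the 2x2 term/class table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def entropy(counts):
    """Shannon entropy (bits) of a list of counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(k / total * math.log2(k / total) for k in counts if k)


def information_gain(a, b, c, d):
    """Class entropy minus class entropy conditioned on term presence."""
    n = a + b + c + d
    h_class = entropy([a + c, b + d])
    h_cond = (a + b) / n * entropy([a, b]) + (c + d) / n * entropy([c, d])
    return h_class - h_cond


def rank_terms(docs, score):
    """Score every word in the vocabulary and sort by descending score."""
    vocab = sorted({w for text, _ in docs for w in tokenize(text)})
    scored = [(score(*contingency(t, docs)), t) for t in vocab]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    for name, fn in [("chi-square", chi_square), ("info gain", information_gain)]:
        print(name, rank_terms(DOCS, fn)[:4])
```

On this toy corpus, words that occur in only one class (such as "great" and "awful") rank highest under both criteria, while words spread evenly across classes (such as "phone") score zero. The top-ranked words would then form the feature vectors passed to a classifier such as an SVM.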
Keywords/Search Tags:Blog, English Text Sentiment Analysis, Preprocessing, Feature Selection, Evaluating Indicator