Research on Semantic Orientation Classification of Chinese Texts Based on Evaluation Objects and Affective Characteristics

Posted on: 2011-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhu
GTID: 2178360308452596
Subject: Communication and Information System
Abstract/Summary:
With the spread of the Internet in recent years, the number of online reviews has grown rapidly. Analyzing these reviews and identifying the semantic orientation they contain is of great practical value for e-commerce, network supervision, and similar applications, so semantic orientation classification is becoming an active research direction in natural language processing.

This thesis focuses on the semantic orientation classification of Chinese texts, which aims to assign a Chinese text to the positive or the negative class by analyzing its semantic orientation. Because of the complexity of semantic expression, applying traditional machine-learning text classification methods directly to semantic orientation classification does not give good results. To improve classification performance, we add more semantic information to the classification system, and we design and implement a semantic orientation classification system for Chinese texts based on evaluation objects and affective characteristics.

Our main work and contributions include:

1) We investigate how well traditional machine-learning text classification methods perform on semantic orientation classification of Chinese texts. Through comparison experiments with different stop word lists, feature selection methods, feature weighting methods, and classifiers, we find that high performance is obtained with a stop word list that retains most parts of speech carrying semantic information, combined with a support vector machine classifier and TF-IDF weighting.

2) We investigate how to obtain the candidate set of affective characteristics. Starting from the words in 《HowNet Word Set for Semantic Analysis》 and extending them with 《Tongyici Cilin Extended》, we obtain a list of common affective words.

3) We investigate how to identify the evaluation objects and affective characteristics in a text. Since the semantic information expressed in a text is directed at particular objects, the evaluation objects and their corresponding affective characteristics need to be identified as the key features reflecting semantic orientation.

4) We propose a text vector model based on evaluation objects and affective characteristics. By using a triple as the text vector feature, we combine semantic information with the traditional text vector model.

5) We propose a feature weighting method named TSF-IDF. By integrating term semantic frequency (TSF) with inverse document frequency (IDF), it takes into account both the semantic orientation of a feature within a document and its importance across the document set.

6) We implement the semantic orientation classification system for Chinese texts based on evaluation objects and affective characteristics. In experiments, we test the system on two corpora, hotel reviews and movie reviews, with a support vector machine classifier and obtain precision of about 89 percent and 87 percent respectively, which is better than the results of traditional text classification methods.
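
The idea behind the TSF-IDF weighting in contribution 5) can be illustrated with a minimal Python sketch. The abstract does not give the formula for TSF or the exact contents of the triple, so everything specific here is an assumption: the feature is taken to be a hypothetical (evaluation object, affective word, polarity) triple with polarity +1 or -1, TSF is assumed to be the polarity-signed count of an (object, word) pair within a document, and IDF is the usual log(N / df). The function name `tsf_idf` is likewise illustrative and is not the thesis's implementation.

```python
import math
from collections import Counter


def tsf_idf(docs):
    """Weight (object, word) features by an assumed TSF times standard IDF.

    docs: list of documents, each a list of hypothetical
          (evaluation_object, affective_word, polarity) triples,
          where polarity is +1 (positive) or -1 (negative).
    Returns one dict per document mapping (object, word) -> weight.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each (object, word) pair occurs.
    df = Counter()
    for triples in docs:
        df.update({(obj, word) for obj, word, _ in triples})

    weighted = []
    for triples in docs:
        # Assumed TSF: polarity-signed count of the pair in this document.
        tsf = Counter()
        for obj, word, polarity in triples:
            tsf[(obj, word)] += polarity

        # Weight = assumed TSF * log(N / df); pairs found in every document get 0.
        weighted.append({
            key: value * math.log(n_docs / df[key])
            for key, value in tsf.items()
        })
    return weighted


if __name__ == "__main__":
    reviews = [
        [("room", "clean", 1), ("room", "clean", 1), ("service", "slow", -1)],
        [("service", "slow", -1)],
        [("plot", "boring", -1)],
    ]
    for vector in tsf_idf(reviews):
        print(vector)
```

Under these assumptions, the sign of each weight carries the orientation of the feature within the document, while the IDF factor lowers the weight of pairs that appear in many documents. The resulting vectors could then be passed to a classifier such as a support vector machine, as in the experiments described above.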
Keywords/Search Tags:semantic orientation classification, dependency parsing, evaluation objects, affective characteristics