Research On Key Algorithm Of Chinese Semantic Orientation Identification

Posted on: 2009-09-23, Degree: Master, Type: Thesis
Country: China, Candidate: J M Zhang
GTID: 2178360245469818, Subject: Signal and Information Processing
Abstract/Summary:
The task of Chinese text semantic orientation identification is to identify the orientation of a document on a specified topic, that is, to determine whether it holds a positive or a negative attitude toward that topic. Semantic orientation identification falls within the scope of natural language processing and is an important area of natural language comprehension research based on comprehensive information theory. In this paper, a VSM representation algorithm based on the semantic orientation information of words is proposed, named the SOVR algorithm. The algorithm makes use of all three levels of comprehensive information, which comprise syntactic, semantic and pragmatic information; it combines statistics-based and rule-based approaches; and it integrates general-domain and domain-specific information. The SOVR algorithm can be used as a pre-processing module ahead of a traditional machine learning module; it generates the VSM-represented input for the comprehensive information model. The experimental results indicate that, compared with other machine learning methods and orientation measurement methods, our algorithm can represent deeper information in the text, such as semantic and pragmatic information. It can also deal with noisy documents retrieved from the internet, and it shows good robustness when applied to corpora from different domains. It reaches its best performance, 90.79% and 92.21%, when combined with the C4.5 decision tree and the SVM algorithm, respectively. We provide a new and effective solution for semantic orientation identification in Chinese text.
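The abstract does not give the details of SOVR, so the following is only a minimal sketch of the general idea of an orientation-weighted vector space representation, not the thesis's algorithm. It assumes the text has already been segmented into tokens, and the lexicon `ORIENTATION`, the negator set `NEGATORS`, and the function `orientation_vector` are all hypothetical names introduced for illustration. Each vocabulary term gets a dimension whose value is the accumulated orientation score of its occurrences (the statistical part), and a simple negation rule flips the score of the token that follows a negator (the rule-based part).

```python
from collections import Counter

# Hypothetical toy orientation lexicon (an assumption, not from the thesis):
# positive words score +1.0, negative words score -1.0.
ORIENTATION = {"good": 1.0, "excellent": 1.0, "bad": -1.0, "poor": -1.0}

# Hypothetical negation words used by the single illustrative rule below.
NEGATORS = {"not", "never"}


def orientation_vector(tokens, vocabulary, lexicon=ORIENTATION, negators=NEGATORS):
    """Map a tokenized document to a vector over `vocabulary`.

    Each dimension holds the summed orientation score of that term's
    occurrences. A negator flips the orientation of the token that
    immediately follows it.
    """
    weights = Counter()
    negate = False
    for tok in tokens:
        if tok in negators:
            negate = True
            continue
        score = lexicon.get(tok, 0.0)
        if score != 0.0:
            weights[tok] += -score if negate else score
        # The negation only reaches the next token, so reset it here.
        negate = False
    return [weights.get(term, 0.0) for term in vocabulary]


vocabulary = ["good", "bad", "excellent"]
tokens = "the service was not good but the food was excellent".split()
vector = orientation_vector(tokens, vocabulary)
```

A vector built this way could then be passed to a standard classifier such as a decision tree or an SVM, which matches the pre-processing role the abstract describes for SOVR.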
Keywords/Search Tags: Natural Language Processing, Semantic Orientation Identification, Vector Space Model