
Research And Implementation On Textual Orientation Classification Of Web Review Text

Posted on: 2011-11-29
Degree: Master
Type: Thesis
Country: China
Candidate: D F Dan
Full Text: PDF
GTID: 2178330338489902
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid expansion of information technology throughout the world, the Internet has become the main carrier of public sentiment. Internet public opinion now forms quickly and has a large impact on society, so monitoring and forecasting it have become increasingly important, and textual orientation classification is one of the most active topics in this area. Textual orientation classification mines users' views, reviews, or opinions on things or events to determine whether each view is positive or negative. It is highly regarded for its value in information filtering, information security, and public opinion monitoring.

Building on a study of existing Chinese textual orientation classification methods, this thesis focuses on using the relevance among web review texts to improve the effect of textual orientation classification. First, considering the features of comment webpages, a specialized crawler is designed to gather them, and, according to the characteristics of review text, a special parser is designed to extract the reviews and their relevancy in preparation for the subsequent classification. Second, based on the relevancy of review texts and the CAAR algorithm, an integrated textual orientation classifier is constructed. Finally, experiments using these results confirm the approach. The aim of this thesis is to improve the effect of textual orientation classification.

The main contents cover the following four aspects:

(1) Study existing textual orientation classification technologies and the characteristics of web review text, and analyze the applicability and shortcomings of traditional orientation classification methods on such data in order to find an appropriate solution.

(2) Traditional data collection methods cannot gather complete data. Considering the features of comment webpages, a specialized web crawler called Deep-Crawler is designed to gather them. According to the characteristics of review text, a special parser called Deep-Parser is designed to extract the reviews and their relevancy in preparation for the subsequent textual orientation classification.

(3) Analyze the shortcomings of current textual orientation classification algorithms on web review text, and use the relevancy of review texts to improve the effect of textual orientation classification. Define the concepts of relevancy and correlation; based on an improved SBV polarity transfer algorithm and the relevancy and correlation of review texts, propose a textual orientation classification algorithm, CAAR; verify the validity of the CAAR algorithm and improve the performance of the textual orientation classifier.

(4) Using the above research results, this thesis designs and implements a prototype system for textual orientation classification of web review text for Internet public opinion, based on YHPODS, as a foundation for follow-up development. The thesis also describes the primary modules in detail.
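
The idea of using the relevancy between review texts can be illustrated with a minimal sketch. This is not the CAAR algorithm or the SBV polarity transfer algorithm, neither of which is specified in this abstract. It is a toy example under stated assumptions: the word lists, the Review structure, and the rule "a reply with no polarity words of its own borrows the polarity of the review it replies to" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical toy lexicon; a real system would use a much larger one.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}


@dataclass
class Review:
    rid: str
    text: str
    parent: Optional[str] = None  # id of the review this one replies to, if any


def lexicon_score(text: str) -> int:
    """Count positive words minus negative words in the text."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score: int) -> str:
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify(reviews: Dict[str, Review]) -> Dict[str, str]:
    """Label each review, falling back to the parent's own score
    when the review itself contains no polarity words (assumed rule)."""
    own = {rid: lexicon_score(r.text) for rid, r in reviews.items()}
    result = {}
    for rid, r in reviews.items():
        score = own[rid]
        if score == 0 and r.parent in own:
            score = own[r.parent]  # borrow polarity from the related review
        result[rid] = label(score)
    return result


if __name__ == "__main__":
    reviews = {
        "a": Review("a", "great phone"),
        "b": Review("b", "same here", parent="a"),
        "c": Review("c", "bad battery"),
        "d": Review("d", "no comment"),
    }
    print(classify(reviews))
```

In this sketch, review "b" has no polarity words of its own and would be neutral in isolation; taking its relation to review "a" into account gives it a label. A thread-aware classifier along the lines described in the thesis would presumably use a richer notion of relevancy than a single reply link.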
Keywords/Search Tags: Internet Public Opinion, Textual Orientation Classification, Web Crawler, CAAR Algorithm