Research on Integrated Opinion Target Extraction for Chinese Product Reviews

Posted on: 2016-12-15, Degree: Master, Type: Thesis
Country: China, Candidate: M J Song
GTID: 2308330464459053, Subject: Computer software and theory
Abstract/Summary:
With the development of electronic commerce, the Internet produces a large amount of evaluative information in the form of Chinese product reviews. It is not realistic for people to analyze this mass of product evaluation information manually, so opinion mining technology for processing it has become necessary. Many methods have been employed for this task. Methods for opinion extraction and analysis fall mainly into two groups: supervised machine learning algorithms and unsupervised machine learning algorithms. Supervised methods require a manually annotated data set. Typically, models are trained on that data set with machine learning methods for tasks such as text classification and feature detection. Although supervised algorithms can achieve good results in opinion extraction, preparing the data set requires a great deal of manual work, which is time-consuming and demands related technical knowledge. In addition, supervised algorithms are often adapted to a specific domain rather than being universally applicable. Unsupervised algorithms do not need a manually annotated data set, and most unsupervised methods are domain-independent, so they can be applied to product reviews in various fields and are more widely used, especially for the analysis of Chinese product reviews. This paper uses unsupervised algorithms for opinion target extraction. However, due to the complexity of Chinese text, an integrated opinion target is usually composed of several word units, which makes integrated opinion target extraction challenging.
To analyze the integrity of opinion targets, unsupervised algorithms mainly take three approaches: (1) high-frequency string extraction, which recognizes strings whose words frequently appear together; (2) part-of-speech (POS) models, which define a set of fixed POS templates; and (3) syntactic dependency methods, which generally analyze the "subject" and "object" structures after parsing and then extract features based on syntactic dependency relations. Building on this analysis, an unsupervised method is proposed. The technique identifies an opinion target by expanding base word units (BWUs) and analyzing the integrity value, the lack value, and the stability value of new words. Finally, a pruning strategy based on overall stability and co-occurrence with opinion models is applied. Empirical results show the validity of the proposed technique, especially for infrequent and complex opinion targets.
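As a rough illustration of the first two approaches listed above (high-frequency string extraction combined with fixed POS templates), the following Python sketch counts contiguous word sequences whose POS pattern matches a template. It is not the method proposed in the thesis: the function name `candidate_targets`, the templates in `POS_TEMPLATES`, the POS tags, the frequency threshold, and the toy reviews are all assumptions made for illustration, and the input is assumed to be already segmented and POS-tagged.

```python
from collections import Counter

# Hypothetical POS templates for candidate opinion targets
# (assumed for illustration, not taken from the thesis).
POS_TEMPLATES = {("n",), ("n", "n"), ("a", "n"), ("n", "n", "n")}


def candidate_targets(reviews, max_len=3, min_freq=2):
    """Count contiguous word sequences whose POS pattern matches a
    template and keep those occurring at least min_freq times."""
    counts = Counter()
    for tokens in reviews:
        for start in range(len(tokens)):
            for length in range(1, max_len + 1):
                window = tokens[start:start + length]
                if len(window) < length:
                    break  # ran past the end of the review
                pos_pattern = tuple(pos for _, pos in window)
                if pos_pattern in POS_TEMPLATES:
                    words = tuple(word for word, _ in window)
                    counts[words] += 1
    return {words: n for words, n in counts.items() if n >= min_freq}


# Toy input: each review is a list of (word, POS tag) pairs (assumed data).
reviews = [
    [("屏幕", "n"), ("分辨率", "n"), ("很", "d"), ("高", "a")],
    [("屏幕", "n"), ("分辨率", "n"), ("不错", "a")],
    [("电池", "n"), ("很", "d"), ("耐用", "a")],
]
result = candidate_targets(reviews)
```

On this toy input the two-word candidate 屏幕 分辨率 ("screen resolution") and each of its parts pass the frequency threshold equally often, so frequency and templates alone cannot decide which string is the complete opinion target. That gap is the kind of case an integrity analysis is meant to address.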
Keywords/Search Tags: Opinion target, Integrity, Part-of-speech models, Deficiency, Chinese product reviews