Research on Key Issues in Opinion Mining

Posted on: 2012-06-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Luo
GTID: 1118330368986366
Subject: Computer application technology

Abstract/Summary:
With the popularization of the Internet and the rapid development of e-commerce, the Web now stores a huge number of customer reviews about products. These reviews express customers' positive or negative feelings about product performance, functionality, and so on. Businesses and manufacturers can analyze these reviews to obtain timely consumer feedback and improve product performance and after-sales service. Potential consumers can draw on the product experiences described in online reviews to make better-informed purchases. However, processing an enormous number of unstructured or semi-structured reviews manually would be extremely expensive and time-consuming. Research on opinion mining of customer reviews has therefore attracted growing attention and has become a hotspot in recent research on Web information processing.

This dissertation addressed several key issues in opinion mining. It explored the concrete ways in which a domain ontology can contribute and the effects it has, and it carried out these tasks by combining information extraction, text mining, and natural language processing techniques. The dissertation emphasized methodological research supported by empirical analysis, proposed new methods based on domain ontology, and obtained the following results:

Firstly, based on an analysis of existing methods and techniques for domain ontology construction, an incremental, iterative method for constructing a domain ontology was proposed, which divided the construction process into three phases and ten levels.
Using this method, the knowledge framework of the domain ontology was enriched and refined through the creation of instances, and a Notebook Ontology was constructed for Product Named Entity Recognition (PNER).

Secondly, based on an examination of the tasks and methods of product named entity recognition, a Conditional Random Fields (CRFs) model was applied to PNER, and the key design choices in the recognition process, namely the size of the "observation window", the modeling granularity, the labeling scheme, and the feature selection, were verified by experiments. To further improve PNER performance, a new external feature, the domain ontology feature, was introduced into the CRFs model. Experimental results showed that the combination of internal and external features performed quite well, and the F-measures for ETY, ATT, and PART on the test set achieved the desired results.

Thirdly, drawing on methods from traditional topic-based text classification, machine learning was applied to the coarse-grained sentiment classification of reviews. To address data sparseness, the sentiment Vector Space Model (s-VSM) was used to represent text. The critical issues in sentiment classification, i.e. the choice of classification algorithm, the feature selection method, and the feature dimension, were verified by experiments. Furthermore, to take into account both a feature's contribution over the entire corpus and its contribution to each category, a feature selection method called Chi-square Difference between the Positive and Negative Categories (CDPNC) was proposed. It combined DF with CHI and achieved better performance. Experiments showed that Macro-F and Micro-F reached 90.18% and 90.08% respectively.

Fourthly, by introducing semantic analysis into sentiment classification, dependency parsing was performed to extract feature-opinion pairs.
Because of the differences between Chinese and English, semantic orientation computation based on Pointwise Mutual Information (PMI) cannot be applied directly to the sentiment classification of Chinese reviews. With practical application in mind, this dissertation improved the positive and negative benchmark words, the threshold, and other settings, and verified that applying PMI-based semantic orientation computation to the sentiment classification of Chinese reviews is feasible and can overcome the weaknesses of semantic similarity computation based on HowNet.

Finally, building on the research above, an opinion mining prototype system was designed and implemented. It can comprehensively mine customer reviews of a product, both overall and in detail. With this system, users can obtain visualized results that help their decision making.
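
As an illustration of the PMI-based semantic orientation computation mentioned above, the following is a minimal sketch in Python. It is not the dissertation's implementation: the benchmark (seed) words, the smoothing constant, the zero threshold, and the use of review-level co-occurrence are all assumptions made for this example, and the toy reviews are English tokens. Chinese reviews would first need word segmentation.

```python
import math
from collections import Counter
from itertools import combinations


def semantic_orientation(docs, pos_seeds, neg_seeds, smoothing=0.01):
    """Score each word as sum of PMI with positive seeds minus sum of PMI
    with negative seeds. `docs` is a list of token lists (one per review).
    Co-occurrence is counted at review level (an assumption of this sketch).
    """
    n_docs = len(docs)
    df = Counter()   # number of reviews containing each word
    co = Counter()   # number of reviews containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        df.update(vocab)
        for a, b in combinations(vocab, 2):
            co[(a, b)] += 1

    def pmi(w, s):
        key = (w, s) if w < s else (s, w)
        # Smoothing keeps the logarithm finite when the pair never co-occurs.
        p_joint = (co[key] + smoothing) / n_docs
        return math.log2(p_joint / ((df[w] / n_docs) * (df[s] / n_docs)))

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for w in df:
        if w in seeds:
            continue
        pos = sum(pmi(w, s) for s in pos_seeds if s in df)
        neg = sum(pmi(w, s) for s in neg_seeds if s in df)
        scores[w] = pos - neg
    return scores


def classify(tokens, scores, threshold=0.0):
    """Label a review by the summed orientation of its known words."""
    total = sum(scores.get(t, 0.0) for t in tokens)
    return "positive" if total > threshold else "negative"


# Toy corpus and hypothetical seed words, for illustration only.
docs = [
    ["screen", "sharp", "good"],
    ["battery", "durable", "good"],
    ["screen", "sharp", "excellent"],
    ["fan", "noisy", "bad"],
    ["keyboard", "flimsy", "bad"],
    ["fan", "noisy", "poor"],
]
scores = semantic_orientation(docs, ["good", "excellent"], ["bad", "poor"])
```

In this sketch, the choice of seed words and the threshold are exactly the kinds of settings the abstract says were adjusted for Chinese reviews; the values used here are placeholders rather than the dissertation's.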
Keywords/Search Tags: Opinion Mining, Domain Ontology, PNER, Sentiment Classification, Semantic Orientation