Sentiment Clustering Of Evaluation Object Based On Incomplete Information Systems

Posted on: 2013-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Yin
Full Text: PDF
GTID: 2248330374956472
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid development of computer and network technology, people increasingly express their subjective opinions on the Internet, for example their views on price, performance and after-sale service. These opinions reveal consumers' liking or dislike of products, along with other emotional tendencies. People usually check online reviews of a product for reference, but given limited time and effort it is not feasible for a reader to extract the relevant information from a mass of comments. Moreover, in practice a product may be involved in many comments, whereas previous work assumes that a document or a single sentence describes only one product. Whether at the document level or the sentence level, it is difficult to obtain a comprehensive evaluation of each product performance. It is therefore very important to analyze and process the subjective opinions about a product from various comments automatically and effectively. The main work is divided into the following aspects:

(1) Opinion feature extraction and sentence summary based on ontology. Opinion features are extracted according to product performance, based on an established domain ontology. Features are extracted directly with matching rules, without word segmentation. The experimental results show that the F-value of the matching-rule method is higher than 55.83%. Sentences that describe the same product are then merged and summarized by product performance, which turns the task into a traditional document-level sentiment analysis problem. Considering the relationships between opinion features, related features are merged into a single "core word". The experimental results show large differences between performances, owing to the differing richness of the data related to each: very few features are integrated for "economic", whereas after integration the new feature set for "comfort" is only 40.87% of the size of the original feature set.

(2) Establishing incomplete emotional information systems for each product performance and reducing the feature dimension. Because default (missing) values exist, the systems established are incomplete information systems. The semantic orientation of each feature is determined from the feature's own semantic orientation together with the orientation of the sentence. To address high dimensionality and missing data, a feature reduction algorithm based on the discernibility matrix is adopted to reduce the feature dimension. For the performance of high concern, "comfort", the reduction rate reaches 55.32%, which greatly reduces redundancy and improves the similarity of opinion features.

(3) Rating each product with a clustering algorithm. Clustering the comments yields an overall view of each product's evaluation and its actual evaluation status. Each product is rated with the K-means clustering algorithm, and the experimental results show that the clustering results are basically consistent with users' perceptions. To illustrate the effectiveness of the feature reduction algorithm used in this work, a comparison experiment is designed that uses LSA as the feature reduction method. However, LSA changes the feature space, which makes it hard to explain the meaning of a specific feature. This part also describes the necessity of incomplete information systems.
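The idea of discernibility-matrix feature reduction over an incomplete information system can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the algorithm used in the thesis: it assumes the common tolerance-relation reading in which two objects are discernible on a feature only when both values are known and differ, and it uses a simple greedy selection that may not return a minimal reduct. The names `MISSING`, `discernibility_entries` and `greedy_reduct`, and the sample data, are hypothetical.

```python
from itertools import combinations

MISSING = None  # hypothetical marker for a default (missing) value


def discernibility_entries(table):
    """Return the non-empty discernibility sets for all pairs of objects.

    Assumption: a feature discerns two objects only when both values
    are known and different; a missing value never discerns.
    """
    entries = []
    for x, y in combinations(table, 2):
        diff = frozenset(
            i for i, (a, b) in enumerate(zip(x, y))
            if a is not MISSING and b is not MISSING and a != b
        )
        if diff:
            entries.append(diff)
    return entries


def greedy_reduct(table):
    """Greedily pick features until every discernibility set is covered.

    The result preserves every discernible pair, but the greedy choice
    does not guarantee the smallest possible feature subset.
    """
    remaining = discernibility_entries(table)
    chosen = []
    while remaining:
        counts = {}
        for entry in remaining:
            for i in entry:
                counts[i] = counts.get(i, 0) + 1
        # Most frequent feature first; ties go to the lowest index.
        best = min(counts, key=lambda i: (-counts[i], i))
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return sorted(chosen)


# Invented sample: rows are comments, columns are opinion features,
# values are orientations (+1 positive, -1 negative, None missing).
reviews = [
    (1, 1, None, -1),
    (1, -1, 1, -1),
    (-1, 1, 1, None),
    (-1, -1, None, 1),
]
reduct = greedy_reduct(reviews)
print(reduct)
```

Unlike a projection method such as LSA, the selected indices refer to original features, so each retained column keeps its meaning.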
Keywords/Search Tags: Incomplete information systems, Evaluation of the objects, Ontology, Feature dimension reduction, Clustering