| With the rapid development of computer technology and the Internet, the volume of online comments keeps growing. Commercial websites, blogs, microblogs, forums and other online media are rapidly becoming platforms on which people express their viewpoints, suggestions and opinions. For a popular product, it is impossible for users to obtain the information they are interested in from Web reviews simply by browsing them. Automatic and effective analysis, processing and summarization of subjective, sentiment-bearing texts therefore has important theoretical significance and practical value for ordinary consumers, e-commerce, network monitoring and other applications. This thesis studies sentiment orientation clustering of Web automobile comments and, on the basis of the established review data and evaluation collocations, carries out clustering research on the sentiment orientation of whole texts and of individual aspects. The main content is as follows: (1) Establishing a Web comment database. This thesis establishes a comment text database by collecting relevant comment texts and then classifying, counting and sorting the comment data. Using domain ontology knowledge, the evaluation objects and evaluation words of the comment texts are defined and analyzed, and co-referring evaluation objects are grouped accordingly. (2) Sentiment orientation clustering of comment texts. We represent each text as a feature vector and propose a linearly weighted weight calculation that combines the sentiment orientation of feature evaluation words with the tendency of sentences. We then use K-Means to cluster the texts by sentiment. To verify the effectiveness of this method, we conduct experiments on real automobile comment text data. 
Experimental results show that the sentiment orientation feature representation significantly improves clustering purity and F-value compared with Boolean and LDA weighting, indicating that the representation method proposed in this thesis is feasible and effective. Sentiment orientation is rated on five levels (poor, slightly poor, fair, slightly good and good), which supports finer-grained analysis and application of the data. (3) Aspect-based product sentiment clustering. To study product comments in more depth, sentiment clustering of automobile comment texts is carried out on six aspects: comfort, controllability, power performance, economic efficiency, safety and service. This thesis provides features and evaluations for every aspect of automobile products, which not only gives a more accurate and comprehensive understanding of reviewers' real intentions but also helps decision-makers make correct decisions. To handle comment texts that discuss multiple objects, this thesis incorporates automobile product information, adds semantic features, and identifies opinion sentences using the ontology on top of named entity recognition. Integrating the association between opinion sentences and evaluation objects or aspects further improves the effect of aspect-based sentiment clustering. |
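
The text-level pipeline in part (2) can be illustrated with a small sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: the function names, the mixing parameter `alpha`, the form `alpha * word_orientation + (1 - alpha) * sentence_tendency`, the toy review scores, and the deterministic K-Means initialisation are all hypothetical and are not the thesis's actual implementation. Purity is computed in its usual sense, as the fraction of texts that fall in the majority gold class of their cluster.

```python
import math
from collections import Counter


def weighted_features(word_scores, sentence_score, alpha=0.6):
    """Assumed linear weighting: mix each evaluation word's orientation
    with the tendency of the sentence that contains it."""
    return [alpha * w + (1 - alpha) * sentence_score for w in word_scores]


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means; the first k points seed the centroids
    (a simplification so that the toy run is deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def purity(labels, gold):
    """Fraction of items that belong to the majority gold class of their cluster."""
    clusters = {}
    for lab, g in zip(labels, gold):
        clusters.setdefault(lab, []).append(g)
    majority = sum(Counter(gs).most_common(1)[0][1] for gs in clusters.values())
    return majority / len(gold)


# Toy data (made up): per-review evaluation-word orientations in [-1, 1]
# over three feature slots, plus one sentence-level tendency score.
reviews = [
    ([1.0, 0.8, 0.0], 0.9),
    ([-1.0, -0.6, 0.0], -0.8),
    ([0.9, 1.0, 0.2], 0.7),
    ([-0.8, -1.0, -0.3], -0.9),
]
gold = ["good", "poor", "good", "poor"]

vectors = [weighted_features(words, sent) for words, sent in reviews]
labels = kmeans(vectors, k=2)
print(labels, purity(labels, gold))
```

On this toy data the positive and negative reviews end up in separate clusters. A five-level grading such as the one used in the thesis would correspond to `k=5` with suitably richer data, and the Boolean and LDA baselines would replace `weighted_features` with their own representations.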