
A Summary Of Commodity Reviews Based On The Analysis Of Comments And Reasons

Posted on: 2015-11-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
GTID: 2208330467989322
Subject: Electronic and communication engineering
Abstract/Summary:
Opinion summarization, a hot topic in the field of comment mining, has attracted widespread attention. On social networks, news portals, and e-commerce websites, the volume of comments and of redundant information is very large, so concise summaries that are generated automatically and presented in a user-friendly view are essential. This paper studies the traditional problems of opinion summarization in depth and proposes a summarization method for product reviews based on opinion-reason association analysis. The method divides product review summarization into three sub-problems: extracting opinion sentences; analyzing the association between opinion sentences and non-opinion sentences; and generating the opinion summary.

To extract opinion sentences, this paper proposes an approach based on association rule mining. A subjective opinion sentence contains both commodity feature words and sentiment words, so the method aims to extract review sentences that match this structure. First, words that occur frequently in the review sentences are extracted as a candidate set of commodity feature words, using association mining based on the Apriori algorithm. Next, non-nominal terms are filtered out of the candidate set, and adjacent words are combined into word groups to obtain the commodity feature vocabulary. Finally, each sentence that contains a commodity feature word is traversed to find the nearest adjective, which is taken as the sentiment word of that sentence.

For association analysis between opinion sentences and non-opinion sentences, this paper recasts the problem as a text-matching problem and proposes a hybrid classification model based on association and statistical features. First, the association feature is the word co-occurrence feature, while the statistical features are three: sentence position, sentence length, and neighborhood correlation.
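The opinion-sentence extraction steps described above (frequent-word mining, noun filtering, nearest-adjective lookup) can be illustrated with a minimal sketch. It is not the thesis implementation: it runs only the first Apriori pass (frequent single words), skips the step that merges adjacent words into word groups, and assumes the sentences are already tokenized and part-of-speech tagged. The tag names `NOUN` and `ADJ`, the function names, and the `min_support` threshold are all hypothetical.

```python
from collections import Counter


def feature_words(tagged_sentences, min_support):
    """Frequent nouns: first Apriori pass plus a noun filter (assumed tags)."""
    n = len(tagged_sentences)
    # Support counts each word at most once per sentence (transaction).
    counts = Counter(w for s in tagged_sentences for w in {w for w, _ in s})
    frequent = {w for w, c in counts.items() if c / n >= min_support}
    nouns = {w for s in tagged_sentences for w, tag in s if tag == "NOUN"}
    return frequent & nouns


def nearest_adjective(tagged_sentence, feature_index):
    """Return the adjective closest (by token distance) to the feature word."""
    best = None
    for i, (_, tag) in enumerate(tagged_sentence):
        if tag != "ADJ":
            continue
        if best is None or abs(i - feature_index) < abs(best - feature_index):
            best = i
    return None if best is None else tagged_sentence[best][0]


def extract_opinions(tagged_sentences, min_support=0.5):
    """Pair each frequent feature noun with its nearest adjective."""
    features = feature_words(tagged_sentences, min_support)
    pairs = []
    for sentence in tagged_sentences:
        for i, (word, tag) in enumerate(sentence):
            if tag == "NOUN" and word in features:
                adjective = nearest_adjective(sentence, i)
                if adjective is not None:
                    pairs.append((word, adjective))
    return pairs
```

A sentence that yields at least one (feature word, sentiment word) pair would be treated as an opinion sentence under these assumptions; a full Apriori run would additionally grow frequent word sets beyond size one.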
Because users tend to describe commodity attributes with fixed collocations, this paper uses association rule mining to obtain frequently occurring collocations as the word co-occurrence feature of the classification model; this is the association feature among the mixed features. Sentence position indicates whether the two input sentences are adjacent to each other. Sentence length is the number of words in the sentence. Neighborhood correlation is a statistical feature based on TF-IDF weighting that indicates whether a word is a common word for a given category of product reviews. Second, a classification model is built on these four features and trained with the SMO algorithm, which splits the optimization problem into several smaller sub-problems and thereby reduces computation time. Third, the results of two cross tests are compared, verifying that the classification model generalizes well across the different types of product reviews in the data set.

For generating the opinion summary, this paper proposes mining words across sentences to obtain word co-occurrence features and cluster sentences that express similar opinions, and also proposes using rule mining to extract commodity-related feature words and cluster the corresponding sentences.
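The four features of the hybrid model can be sketched as a feature vector for one (opinion sentence, candidate sentence) pair. The abstract does not give exact formulas, so every definition below is an assumption made for illustration: co-occurrence counts matched collocation pairs, length is taken from the candidate sentence, and neighborhood correlation is approximated as the mean TF-IDF weight of the candidate's words within one review category. All function and parameter names are hypothetical, and the SMO training step is not shown.

```python
import math
from collections import Counter


def tfidf_weights(category_docs):
    """Map each category to {word: tf * idf}, treating a category as one document."""
    n = len(category_docs)
    df = Counter(w for words in category_docs.values() for w in set(words))
    weights = {}
    for category, words in category_docs.items():
        tf = Counter(words)
        total = len(words)
        weights[category] = {
            w: (c / total) * math.log(n / df[w]) for w, c in tf.items()
        }
    return weights


def pair_features(opinion, candidate, opinion_idx, candidate_idx,
                  collocations, category_weights):
    """Build [co-occurrence, adjacency, length, neighborhood] for a sentence pair."""
    # Association feature: collocations split across the two sentences.
    cooccur = sum(
        1 for a in set(opinion) for b in set(candidate) if (a, b) in collocations
    )
    # Position feature: 1 if the sentences are next to each other in the review.
    adjacent = 1 if abs(opinion_idx - candidate_idx) == 1 else 0
    # Length feature: number of words in the candidate sentence (assumed choice).
    length = len(candidate)
    # Neighborhood correlation (assumed form): mean TF-IDF weight in the category.
    neighborhood = (
        sum(category_weights.get(w, 0.0) for w in candidate) / length
        if candidate else 0.0
    )
    return [cooccur, adjacent, length, neighborhood]
```

Vectors of this shape, labeled as associated or not associated, could then be passed to any SVM trainer that uses an SMO-type solver.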
Keywords/Search Tags: Opinion summarization, Mixed feature, Rule mining, Association analysis