
Research On Multi-document Opinion Summarization For Product Reviews

Posted on: 2017-04-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Wang
Full Text: PDF
GTID: 1108330503469600
Subject: Computer application technology
Abstract/Summary:
Opinion summarization, also known as sentiment summarization, is a technology that analyzes subjective texts carrying sentiment information, condenses their content, and generates a summary. With the rapid growth of subjective documents on the Internet, new application requirements for opinion summarization are emerging at an unprecedented rate. These requirements bring new opportunities and challenges for Natural Language Processing (NLP) research. In recent years, opinion summarization has attracted researchers' attention and produced some results, and researchers have tried to apply it in practical settings such as decision support, public opinion monitoring, and information forecasting.

Opinion summarization mainly comprises three parts: sentiment element extraction, sentiment orientation recognition, and sentiment information induction, all of which belong to the field of sentiment analysis. This dissertation studies these three key parts. Sentiment element extraction and sentiment orientation recognition are fundamental research tasks; their aim is to recognize effective evaluative units (aspects, opinion words, etc.) in evaluative texts and to determine the orientation of these units. Sentiment information induction is an application task whose aim is to generalize the important evaluative information into a concise and refined summary. This dissertation also studies the comprehensive ranking of products. The main contents are as follows:

Extraction of comparative elements using conditional random fields. Feature selection is crucial to modeling in statistical learning methods. This dissertation proposes several linguistically motivated features, such as a shallow syntactic feature, a comparative word candidate feature, and a heuristic position feature, and incorporates them into a conditional random field (CRF) model. Experiments show that the shallow syntactic feature is effective for recognizing phrase-level elements. The comparative word candidate feature not only compensates for insufficient training data but also helps locate other elements. The heuristic position feature helps distinguish elements with similar parts of speech. The proposed approach improves every performance measure of comparative element extraction.

Orientation recognition of ambiguous opinion words by integrating intra- and inter-opinion features. Ambiguous opinion words are words whose sentiment orientation changes with context. Previous studies mostly investigated inter-opinion features and paid little attention to intra-opinion features, which leads to a low orientation recognition rate. This dissertation proposes an unsupervised approach that integrates intra- and inter-opinion features. The approach introduces two intra-opinion features, modifiers and high-frequency collocations, which effectively address the low precision of orientation recognition. The low recall is effectively addressed by integrating two inter-opinion features.

Product multi-attribute ranking based on the Analytic Hierarchy Process (AHP) model. Comprehensive evaluation assesses multiple objects against several indexes in order to rank them or to select a preferred one. This dissertation proposes a modeling method for comprehensive product ranking that ranks products by building an AHP model. An AHP model first decomposes an evaluation problem into a hierarchy of more easily comprehended sub-problems, such as goal, criteria, subcriteria, and alternatives. On this basis, quantitative analysis is performed by calculating the weight of each element relative to an element in the level above. Finally, the combined weights across levels are computed to produce the comprehensive product ranking. The method also integrates a graph model and user requirements during modeling, which can effectively solve the comprehensive product ranking problem.

Template-based abstractive multi-document opinion summarization. Although multi-document summarization is continually investigated as part of the annual Text Analysis Conference (TAC), summarizing a large number of evaluative documents is still a novel and challenging computational task. This dissertation proposes a template-based natural language generation method. The method first extracts opinion information (evaluated entities, aspects, etc.) and judges sentiment orientation. The opinion information extracted from the documents is then mapped to standard information and organized into a user-defined aspects (UDA) tree. Finally, an opinion summary is built through summary structure planning, template design, sentence generation, and content selection. Three types of opinion summaries are constructed: an overall summary, an entity summary, and a contrastive summary. The dissertation thus explores a method for abstractive opinion summarization.
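
As an illustration of the AHP-based ranking described above, the following minimal Python sketch computes priority weights from pairwise comparison matrices and combines them across two levels (criteria and alternatives). It is only a sketch under stated assumptions: the row geometric-mean approximation, the function names `priority_weights` and `rank_products`, and all comparison values and product names are hypothetical and are not taken from the dissertation, whose model additionally integrates a graph model and user requirements that are not shown here.

```python
from math import prod


def priority_weights(pairwise):
    """Approximate an AHP priority vector using row geometric means.

    pairwise[i][j] states how strongly item i is preferred over item j
    (reciprocal matrix: pairwise[j][i] == 1 / pairwise[i][j]).
    """
    n = len(pairwise)
    geo_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_products(criteria_matrix, product_matrices, names):
    """Combine criteria weights with per-criterion product weights."""
    criteria_w = priority_weights(criteria_matrix)
    local_w = [priority_weights(m) for m in product_matrices]
    scores = {
        name: sum(cw * lw[i] for cw, lw in zip(criteria_w, local_w))
        for i, name in enumerate(names)
    }
    # Highest combined weight first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical example: two criteria (battery, screen) and two products.
criteria = [[1, 3], [1 / 3, 1]]   # battery judged 3x as important as screen
battery = [[1, 5], [1 / 5, 1]]    # phone_a strongly preferred on battery
screen = [[1, 1 / 2], [2, 1]]     # phone_b moderately preferred on screen

ranking = rank_products(criteria, [battery, screen], ["phone_a", "phone_b"])
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In a fuller treatment, the consistency of each comparison matrix would normally be checked before the weights are used; that step is omitted here for brevity.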
Keywords/Search Tags: sentiment analysis, opinion summarization, comparative element extraction, sentiment orientation recognition, entity comprehensive ranking