
Research On The Method Of Multi-document Summarization Based On Topic Model Of Opinion Mining

Posted on: 2019-03-24 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Ren | Full Text: PDF
GTID: 2428330548461162 | Subject: Engineering
Abstract/Summary:
As information technology develops ever faster, Internet applications have become inseparable from every aspect of social life, and the resulting information explosion is felt in many fields. As Internet technology continues to evolve, obtaining the information people need accurately and quickly from massive data has become an urgent and challenging task. Multi-document automatic summarization is a tool for quickly extracting useful information from large amounts of data. This thesis first surveys the state of research on automatic summarization systems in China and abroad. To address the shortcomings of existing topic-oriented automatic summarization systems, it proposes introducing opinion mining into the construction of the topic model and, on that basis, designs a more efficient multi-document automatic summarization system. The main research work consists of the following three parts:

(1) Through an analysis of opinion mining techniques, the latent relationship between opinion information and document topics is discussed. Starting from the traditional LDA topic model, the defects caused by LDA's bag-of-words assumption are analyzed, and an improved topic model based on opinion word pairs is designed. The construction of a document is abstracted as a text generation process centered on opinion information. By analyzing the dependency relations within sentences, word pairs are extracted that represent an opinion entity together with its corresponding opinion attribute or opinion content. On the basis of these binary opinion word pairs, the Opinion Topic Model (OTM) is designed.

(2) For the automatic multi-document summarization task, to address the shortcomings of previous work, the model extracts the opinion information of the text to assist in modeling latent topic information, and fuses multiple text features to compute sentence weights. These features are specified in detail and include topic importance, word importance, cue words, and position information. For extracting summary sentences, the calculation of semantic similarity between sentences is introduced into the traditional redundancy control method, producing a summary that is more general and has higher coverage.

(3) Based on the improved topic model and the proposed automatic multi-document summarization method, experiments were designed and conducted on the Jingdong (JD.com) review dataset and the DUC2007 dataset. The experiments show that the opinion topic model constructed in this thesis can simulate the text generation process and model the topics implied by the documents well. Compared with traditional methods, the topic model has lower perplexity, and the topic information it produces is more coherent and more consistent with people's subjective judgment. The multi-document automatic summarization system based on the opinion topic model can generate high-quality summaries with high topic coverage and low redundancy.
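
The sentence-weighting and redundancy-control steps described in part (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The feature values, the weights, the similarity threshold, and the function names (`sentence_score`, `select_summary`) are all hypothetical, and plain bag-of-words cosine similarity stands in for the semantic similarity measure, which the abstract does not specify.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_score(features: dict, weights: dict) -> float:
    """Linear fusion of per-sentence features (assumed form, not from the thesis)."""
    return sum(weights[name] * features[name] for name in weights)


def select_summary(sentences, features, weights, max_sentences=2, sim_threshold=0.5):
    """Greedy selection: take sentences in descending score order and skip
    any candidate too similar to a sentence already chosen."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(features[i], weights),
        reverse=True,
    )
    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        if all(cosine(vectors[i], vectors[j]) < sim_threshold for j in chosen):
            chosen.append(i)
    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


# Toy data: feature values and weights are made up for illustration only.
sentences = [
    "the battery life is excellent",
    "battery life is excellent overall",
    "the screen is too dim outdoors",
]
features = [
    {"topic": 0.9, "word": 0.8, "cue": 0.0, "position": 1.0},
    {"topic": 0.9, "word": 0.7, "cue": 0.0, "position": 0.5},
    {"topic": 0.6, "word": 0.5, "cue": 0.0, "position": 0.3},
]
weights = {"topic": 0.4, "word": 0.3, "cue": 0.1, "position": 0.2}

summary = select_summary(sentences, features, weights)
print(summary)
```

In this toy run the second sentence scores well but nearly duplicates the first, so the redundancy check drops it and the lower-scoring third sentence is kept instead. In a real system the topic feature would come from the topic model and the similarity function from a semantic measure.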
Keywords/Search Tags: Multi-document summarization, topic model, opinion mining, redundancy control