
Research And Implementation Of Text Summarization Technology In Public Opinion Monitoring System

Posted on: 2019-10-24 | Degree: Master | Type: Thesis
Country: China | Candidate: W J Yang | Full Text: PDF
GTID: 2428330590492472 | Subject: Software engineering
Abstract/Summary:
With the explosive growth of network data, it has become increasingly difficult to acquire information quickly and use it effectively. Automatic text summarization is an important means of resolving the tension between the explosive growth of information and its effective use, and it has received extensive attention. It aims to present the core content of the original text to the user in the most concise manner, thereby improving users' access to and utilization of information and supporting other text processing technologies. At the same time, the public opinion monitoring system (POMS) is an important tool for modern information acquisition and analysis. It can automatically crawl a large amount of information from the network and then perform statistical and summative analysis of that information. A mature POMS generally requires automatic text summarization in the text analysis stage. Succinct, accurate, and clear summaries can effectively improve the efficiency of text retrieval and text similarity calculation in the POMS, and they provide the core information content for generating event summaries.

Against the application background of the POMS, we focus on automatic text summarization technology and propose a single-text summarization model and an event-based text clustering method. We then combine these two methods to design and implement the opinion gathering and event summary generation functions of the event analysis module in the POMS.

Firstly, we design and implement a neural abstractive summarization model for single documents. To discover the potential dependency relationships among the original sentences, we use a hierarchical encoder to transform the original sentences into latent semantic vector representations that can be computed on, and then use the self-attention mechanism to mine these dependency relationships. For the out-of-vocabulary (OOV) problem, we adopt attentional sampling and a copy mechanism to generate tokens. Comparative experiments on the LCSTS dataset show that the model is feasible and effective.

Secondly, because the texts handled by the event analysis module of the POMS must describe the same event, we propose a text event clustering method based on ALN and WCC. Based on the characteristics of public opinion texts that describe the same event, the ALN is used to construct a network from the original texts, and community discovery is then performed on that network. To solve the problem that different communities can share the same node, we adopt a node update strategy based on WCC and use cosine similarity to consolidate communities.

Finally, we design and implement the opinion gathering and event summary generation functions in the event analysis module of the POMS. Using the sentence vectors from the single-text summarization model as the representation of each text, the x-medoids algorithm clusters the texts by content similarity in order to mine the different sub-aspect views within an event. The generated abstract of each cluster center is then extracted as the description of that viewpoint, and these descriptions are spliced into the event summary text according to cluster size.
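The last step described above (cluster the texts by content similarity, take the generated summary of each cluster center as a viewpoint description, and splice the descriptions by cluster size) could be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation. It swaps in a plain k-medoids loop with a fixed `k` in place of x-medoids, which chooses the number of clusters automatically. It also assumes cosine distance as the content-similarity measure and a largest-cluster-first order, neither of which the abstract specifies. The function names `cosine_similarity`, `k_medoids`, and `build_event_summary` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def k_medoids(vectors, k, max_iter=100):
    """Plain alternating k-medoids on cosine distance.

    Assumption: k is fixed by the caller (x-medoids would select it).
    Returns (medoids, labels): medoids are indices into `vectors`, and
    labels[i] is the cluster number assigned to vectors[i].
    """
    n = len(vectors)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of vectors")
    # Pairwise cosine distances, with an exact zero on the diagonal.
    dist = [
        [0.0 if i == j else 1.0 - cosine_similarity(vectors[i], vectors[j])
         for j in range(n)]
        for i in range(n)
    ]

    def assign(current):
        # Each item joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]]) for i in range(n)]

    medoids = list(range(k))  # deterministic start: the first k items
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # The new medoid is the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


def build_event_summary(vectors, summaries, k, sep=" "):
    """Splice the summaries of the cluster centers, largest cluster first."""
    medoids, labels = k_medoids(vectors, k)
    sizes = Counter(labels)
    order = sorted(range(k), key=lambda c: sizes[c], reverse=True)
    return sep.join(summaries[medoids[c]] for c in order if sizes[c] > 0)


if __name__ == "__main__":
    # Toy data: 2-d stand-ins for sentence vectors, placeholder summaries.
    toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
    toy_summaries = ["summary 0", "summary 1", "summary 2", "summary 3"]
    print(build_event_summary(toy_vectors, toy_summaries, k=2))
```

In this sketch, `vectors[i]` stands for the sentence vector of text `i` and `summaries[i]` for the abstract generated for that text. Using medoids rather than centroids means every cluster center is an actual text, so its generated summary can be reused directly as the viewpoint description.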
Keywords/Search Tags: automatic text summarization, deep learning, public opinion system, community discovery, text clustering