
Research And Implementation Of Article Similarity Analysis And Text Summarization In Public Opinion System

Posted on: 2020-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Sun
Full Text: PDF
GTID: 2427330623463784
Subject: Software engineering
Abstract/Summary:
In recent years, China's Internet penetration rate has risen steadily, and more and more netizens take part in the emergence, development, and dissemination of online public opinion events. The volume of online public opinion data has grown exponentially. Faced with this volume, it is valuable to quickly find the public opinion texts related to a social event and to generate summaries that help people judge how the event is developing. Public opinion systems emerged in this context. A public opinion system is a tool that automatically collects and analyzes online public opinion information. It collects public opinion data from the Internet with a web crawler, helps users locate relevant texts quickly through its retrieval and monitoring functions, and combines data statistics with automatic text summarization to provide analysis reports and content summaries, giving users a comprehensive understanding of public opinion events.

This paper designs and implements a public opinion system for news text. Building on research into similarity analysis and summary generation, its main contribution is the system's multi-text automatic summarization function. The specific work includes the following aspects:

1) Introduce the research progress of text similarity analysis and automatic text summarization, and analyze the advantages and disadvantages of the relevant methods.
2) Research and implement an article similarity analysis method based on ALN (Association Link Network). Because ALN semantic nodes can contain polysemous words, the polysemous words are divided according to part of speech and a position-based node weight coefficient is added, which strengthens the semantic network's ability to express the semantic information of the text. Based on the WCC (Weighted Community Clustering) metric, semantic chain weights are added to its cohesion calculation to improve event semantic discovery.
3) Research and implement a short-text summarization model based on the Seq2Seq framework. A hierarchical LSTM encoder is adopted to capture more of the text's semantic information, compensating for the RNN's limited ability to learn the semantics of long text. An attention mechanism and a copy mode are used to address the OOV (out-of-vocabulary) problem of the standard Seq2Seq framework. The attention calculation is modified and combined with a coverage mechanism to address the repetition problem of the attention mechanism.
4) Propose a multi-document summary generation method based on event semantics. Building on the LexRank algorithm, candidate summary sentences are ranked by their correlation with the event semantics, and the highest-scoring sentences are selected and filtered for redundancy to generate the multi-document summary.
5) Based on the above methods, implement automatic summary generation over multiple news texts for news public opinion information. The public opinion system itself is also designed and implemented; its main functions are public opinion collection, retrieval, monitoring, and briefing.
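
The multi-document step in item 4) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: sentences are whitespace-tokenized bags of words, centrality comes from a LexRank-style power iteration over cosine similarities, event semantics is reduced to a plain list of event terms, the two scores are mixed linearly with a hypothetical weight `lam`, and redundancy is removed greedily with a hypothetical threshold `redundancy`.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (whitespace tokenization, an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(vectors, damping=0.85, iterations=50):
    """LexRank-style centrality: power iteration on the similarity graph."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * sim[j][i] / row_sums[j]
                            for j in range(n) if row_sums[j])
            for i in range(n)
        ]
    return scores


def summarize(sentences, event_terms, k=2, lam=0.5, redundancy=0.6):
    """Rank by centrality plus event relevance, then drop near-duplicates."""
    if not sentences:
        return []
    vectors = [bow(s) for s in sentences]
    event = Counter(t.lower() for t in event_terms)
    centrality = lexrank(vectors)
    top = max(centrality) or 1.0
    # Assumed scoring: linear mix of normalized centrality and event relevance.
    scored = [(lam * c / top + (1 - lam) * cosine(v, event), i)
              for i, (c, v) in enumerate(zip(centrality, vectors))]
    chosen = []
    for _, i in sorted(scored, key=lambda p: (-p[0], p[1])):
        # Greedy redundancy elimination against already selected sentences.
        if all(cosine(vectors[i], vectors[j]) < redundancy for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, event_terms, k=2)` returns up to `k` sentences in their original order. Two near-identical sentences will not both be selected, because the second one fails the similarity check against the first.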
Keywords/Search Tags: Public opinion system, Association Link Network, automatic text summary, deep learning