Research And Application Of Microblog Events Abstract Generation And Evolution Analysis Technology

Posted on: 2020-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Jiang
GTID: 2428330596976080
Subject: Information and Communication Engineering
Abstract/Summary:
Microblog platforms have become an important medium for disseminating real-time information. When a hot event occurs, a microblog platform such as Twitter immediately generates a large number of tweets related to the event, which are then buried in the massive volume of microblog content. Because microblog data has low information density and high redundancy, it is difficult for users to understand the occurrence and evolution of a hot event quickly and accurately by searching and browsing. How to grasp the evolution of a hot event from massive microblog data and present it to users as a concise summary has therefore become a research hotspot in social network analysis. However, microblog data is short, irregular and large in scale, so traditional topic detection and tracking techniques designed for long text (such as news reports) are no longer applicable. Based on the Twitter platform, this thesis proposes a method for evolution analysis and summary extraction of microblog events. The method presents summaries of an event's evolution stages in timeline form. The main research work is as follows.

Firstly, an evolution stage detection algorithm based on a keyword co-occurrence graph is proposed. A microblog event passes through different stages as it develops. The algorithm takes the tweet stream of a Twitter event as input and constructs a keyword co-occurrence graph from the keywords and their co-occurrence relationships. It then obtains keyword communities with an overlapping community partitioning algorithm, where each community corresponds to one evolution stage. Finally, it clusters the tweet data set according to the keywords in each community to obtain the tweet cluster of each evolution stage, which completes the stage detection. Experimental results indicate that the algorithm is reliable and produces higher-quality input for the subsequent summary extraction.

Secondly, a method for scoring how well a microblog summarizes the event content, based on the mutual reinforcement of words and sentences, is proposed. The method computes the similarity between microblog texts within an event to measure how well each microblog generalizes the event content, and so obtains a generalization score for each microblog. The scoring rests on the mutual influence between words and sentences: (1) words in high-scoring microblogs should have higher weights; (2) microblogs containing more high-weight words should score higher. The final generalization score is obtained by iterating the word weights and sentence scores until convergence. Experiments show that this method yields a better summary set for each evolution stage.

Thirdly, a microblog event summary extraction method based on comprehensive scoring is proposed. The comprehensive scoring algorithm combines the propagation features of microblog data (the poster's number of followers, number of friends and number of posts) with text features. That is, the importance of a microblog is measured from three aspects: its propagation influence, its degree of content generalization, and its proportion of feature words. Then, from the microblog collection of each evolution stage, the top-k microblogs are dynamically extracted as the stage summary. Redundancy among the stage summaries is removed with the maximal marginal relevance (MMR) algorithm, and the development of the whole microblog event is displayed along a timeline. Experiments show that the algorithm achieves good results in summary extraction.
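The word-sentence scoring rules and the redundancy-removal step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the whitespace tokenizer, the L2 normalisation, the Jaccard similarity, the `lam` trade-off parameter and all function names are assumptions made for illustration, and the redundancy step is read here as maximal marginal relevance (MMR). The thesis's comprehensive score also uses propagation features, which this sketch omits.

```python
import math


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real tweet preprocessing)."""
    return [w.lower() for w in text.split()]


def mutual_reinforcement_scores(posts, iterations=50, tol=1e-9):
    """Iterate word weights and post scores until the post scores stop changing.

    Rule 1: a word's weight is the sum of the scores of the posts containing it.
    Rule 2: a post's score is the sum of the weights of the words it contains.
    Both vectors are L2-normalised every round (an assumption) to stay bounded.
    """
    bags = [set(tokenize(p)) for p in posts]
    vocab = set().union(*bags)
    post_score = [1.0] * len(posts)
    word_weight = {w: 1.0 for w in vocab}
    for _ in range(iterations):
        new_weight = {w: 0.0 for w in vocab}
        for bag, score in zip(bags, post_score):
            for w in bag:
                new_weight[w] += score
        norm = math.sqrt(sum(v * v for v in new_weight.values())) or 1.0
        new_weight = {w: v / norm for w, v in new_weight.items()}

        new_score = [sum(new_weight[w] for w in bag) for bag in bags]
        norm = math.sqrt(sum(s * s for s in new_score)) or 1.0
        new_score = [s / norm for s in new_score]

        delta = max((abs(a - b) for a, b in zip(new_score, post_score)), default=0.0)
        post_score, word_weight = new_score, new_weight
        if delta < tol:
            break
    return post_score, word_weight


def jaccard(a, b):
    """Word-set overlap, used here as a simple similarity measure (an assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(posts, scores, k=2, lam=0.5):
    """Greedy MMR: prefer high-scoring posts that differ from those already chosen."""
    bags = [set(tokenize(p)) for p in posts]
    selected = []
    candidates = list(range(len(posts)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max((jaccard(bags[i], bags[j]) for j in selected), default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    sample = [
        "earthquake hits city center",
        "earthquake hits city center today",
        "rescue teams arrive after earthquake",
        "i love coffee",
    ]
    scores, weights = mutual_reinforcement_scores(sample)
    for idx in mmr_select(sample, scores, k=2):
        print(round(scores[idx], 3), sample[idx])
```

In this toy input the near-duplicate second tweet is penalised by the redundancy term, so the selection favours a tweet that adds new information. A real pipeline would replace the tokenizer and similarity measure with ones suited to tweet text.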
Keywords/Search Tags: natural language processing, evolutionary phase detection, community division, event summary, keyword co-occurrence graph