Passage Level Event Representation And Relevance Computation

Posted on: 2020-07-03; Degree: Master; Type: Thesis
Country: China; Candidate: Y T Liu; Full Text: PDF
GTID: 2428330590473217; Subject: Computer technology
Abstract/Summary:
Event extraction has long been an active topic in academic research. The core information of a conversation or article is often one or more events, so event extraction can provide key information and important features for clustering, recommendation, reasoning and other tasks. As Internet information feeds have developed, people's main source of information is no longer active search but passive recommendation. Products such as Jinri Toutiao and Tiantian Kuaibao, which rely on personalized news recommendation to strengthen user stickiness, keep emerging. Such content recommendation usually analyzes the events a user is interested in and then pushes articles about the same or similar events. This first requires extracting text-level events and classifying them accordingly. At present, research on event extraction focuses mainly on the sentence level; text-level events have received little study and therefore have high research value.

Working on the Tiantian Kuaibao news corpus, this paper extracts the keywords of text-level events and calculates event correlation from two aspects: headlines and article bodies. It uses a short-text similarity method to calculate the similarity of the headlines of two news articles, which is the first index of news event correlation. For completeness of reporting and fluency of reading, a news article often covers historical information, related events and related persons in addition to the core events; for the purpose of event extraction, this is redundant information. To extract the core events more accurately, this paper first screens the key information of the article, then extracts the key event words, and on that basis calculates the event correlation of the article bodies, which is the second index of news event correlation. The weighted sum of the two indices is taken as the final news event correlation.

For the article body, an EM-based TextRank algorithm is first used to extract key sentences and calculate the importance of each word in the article. Then the ZORE triple extraction model is used to extract sentence-level event information from the key sentences. Based on the word importance scores and the sentence-level event extraction results, this paper constructs an event connectivity graph and uses the TextRank algorithm to extract key event words. To address the problem that verbs have different meanings in different contexts, and to make full use of the graph structure information, this paper uses a VerbNet-based word vector adjustment model and a Graph Embedding model to adjust the vectors of the key event words. Finally, this paper proposes a simple model that calculates the relevance of text events from the word vectors of the key event words, and uses it to calculate the relevance of two articles.

For headlines, this paper uses a pre-trained Seq2Seq model to vectorize news headlines. Both the Encoder and the Decoder of the model use GRUs, and an Attention operation is added between the Encoder and the Decoder. The cosine similarity of the two headline vectors is taken as the event correlation of the headlines. The experiments are carried out on a manually annotated Tiantian Kuaibao news data set, and the results meet the requirements.
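As a rough illustration of the scoring steps described above, the following Python sketch shows TextRank-style ranking over a small weighted word graph, cosine similarity between two vectors, and the weighted sum of the two correlation indices. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the damping factor of 0.85, the iteration count, the equal default weighting, and the toy graph are all hypothetical, since the abstract does not specify them. The EM-based variant, ZORE extraction, vector adjustment and Seq2Seq encoding are not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_u * norm_v)


def textrank(edges, damping=0.85, iterations=50):
    """Score the nodes of an undirected weighted graph by power iteration.

    edges maps (node_a, node_b) to a positive weight. The damping factor
    and iteration count are assumed defaults, not values from the thesis.
    """
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight

    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, linked in neighbors.items():
            rank = 0.0
            for other, weight in linked.items():
                # Each neighbor passes on its score in proportion to the
                # share of its total edge weight that points at this node.
                rank += weight / sum(neighbors[other].values()) * scores[other]
            updated[node] = (1.0 - damping) + damping * rank
        scores = updated
    return scores


def combined_relevance(headline_similarity, article_relevance, headline_weight=0.5):
    """Weighted sum of the two indices; the 0.5 weight is a hypothetical default."""
    return (headline_weight * headline_similarity
            + (1.0 - headline_weight) * article_relevance)


# Toy event graph (hypothetical): "announce" links to every other word.
toy_edges = {
    ("announce", "company"): 1.0,
    ("announce", "merger"): 1.0,
    ("announce", "ceo"): 1.0,
}
word_scores = textrank(toy_edges)
top_word = max(word_scores, key=word_scores.get)
```

In this toy graph the most connected word receives the highest score, which is the behaviour key event word extraction relies on. In practice the headline vectors would come from the encoder and the weight between the two indices would be tuned on annotated data.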
Keywords/Search Tags: passage event connected graph, text detection, TextRank, Graph Embedding, Seq2Seq