Research And Application Of News Event Clustering Algorithm Based On Semantic Relationship Graph

Posted on: 2020-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Liu
Full Text: PDF
GTID: 2428330590995672
Subject: Computer technology
Abstract/Summary:
With the development of mobile communication technology, the mobile application market has grown increasingly prosperous, and news applications make it very convenient for users to receive news. However, because news is updated in real time, the volume of news that users receive is overwhelming and the coverage of any single event is scattered, so it is difficult for users to browse the events they care about in a short time. Classifying news by event before pushing it makes news organization more concise and effective and improves users' browsing efficiency. Clustering news at event granularity is therefore of great practical significance. The main work is as follows:

(1) This thesis introduces the background and significance of the research, and surveys and summarizes the state of research on news event clustering in China and abroad. It introduces the basic process and related technologies of news event clustering, and analyzes the shortcomings of current news event clustering algorithms.

(2) Current news event representation methods can only extract events of predefined types and are unable to build an effective semantic environment. To address these shortcomings, this thesis proposes a sentence-level unsupervised event information extraction algorithm and designs an event representation algorithm based on a semantic relationship graph. First, the semantic units directly related to the news event are extracted from each news item. Then the semantic environment corresponding to the news corpus is constructed: a local semantic relationship graph is built for each news item according to the associations between the terms in its semantic units. Finally, the local semantic relationship graphs of multiple news items are merged into a global semantic relationship graph, which represents the associations between news items. Experiments show that the proposed event representation algorithm is strongly expressive and effectively reflects event relationships and event cluster information.

(3) Graph embedding is applied to the obtained semantic relationship graph, transforming relationships between nodes into relationships between vectors, so that the graph can be clustered through its vectors. Existing graph embedding algorithms cannot accurately capture cluster information and destroy high-order structural information; to address this, a graph embedding algorithm for reinforcing vertex clusters (GE_RVG) is proposed. In its embedding space, nodes in the same cluster lie closer together, while nodes in different clusters lie relatively far apart. To address the destruction of high-order information, a method of constructing global subgraphs by sampling the original graph is proposed. To address the inability to restore cluster information in the embedding space, a pseudo-cluster algorithm based on the triangle motif is proposed. Cluster information is thereby strengthened in the embedding space, and the boundaries between clusters become more distinct. Experiments on datasets such as Polblogs show that the method outperforms DeepWalk, LINE and Node2Vec.

(4) Based on the above results, a prototype news event clustering system is developed. It designs and implements news event extraction and representation, construction and updating of the semantic relationship graph, vectorized representation of graph nodes, and clustering of news events.
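
The graph construction outlined in item (2) and the triangle motif mentioned in item (3) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes that each news item has already been reduced to a list of semantic units (lists of terms), that an edge simply counts how often two terms co-occur in a unit, and that merging sums edge weights. The function names local_graph, merge_graphs and triangles, and the sample terms, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_graph(semantic_units):
    """Build a weighted term co-occurrence graph for one news item.

    semantic_units is a list of term lists (an assumed input format).
    Each undirected edge is stored as a sorted (term_a, term_b) tuple
    mapped to the number of units in which both terms appear.
    """
    edges = Counter()
    for unit in semantic_units:
        for a, b in combinations(sorted(set(unit)), 2):
            edges[(a, b)] += 1
    return edges


def merge_graphs(local_graphs):
    """Merge local graphs into a global graph by summing edge weights."""
    merged = Counter()
    for graph in local_graphs:
        merged.update(graph)
    return merged


def triangles(edges):
    """List every triangle (three mutually connected terms) in the graph."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    found = set()
    for a, b in edges:
        # Any common neighbour of both endpoints closes a triangle.
        for c in adjacency[a] & adjacency[b]:
            found.add(tuple(sorted((a, b, c))))
    return sorted(found)


# Two toy news items, each given as a list of semantic units.
news_1 = [["earthquake", "city", "rescue"], ["rescue", "team"]]
news_2 = [["earthquake", "rescue"], ["team", "city"]]

global_graph = merge_graphs([local_graph(news_1), local_graph(news_2)])
print(global_graph[("earthquake", "rescue")])
print(triangles(global_graph))
```

In this toy setting, terms shared by several news items accumulate heavier edges in the merged graph, and triangles mark small groups of tightly related terms. How the thesis actually weights edges, samples subgraphs, or turns triangles into pseudo-clusters is not specified in the abstract.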
Keywords/Search Tags:News Event Clustering, Graph Embedding, Event Extraction, Semantic Relationship Graph, Graph Representation