Research And Application Of Hotspot Discovery Technology For Network Events

Posted on: 2024-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: W H He
GTID: 2558307079472254
Subject: Electronic information
Abstract/Summary:
In the era of new media, discovering online event hotspots matters for personal information acquisition, enterprise promotion, and the stability of public opinion. This thesis therefore studies hotspot discovery for online events using text clustering and text summarization, focusing on how to obtain hotspot information quickly and accurately from the vast and rapidly growing volume of unlabeled text on the Internet. The main contributions and work of this thesis are as follows.

(1) A multi-stage clustering method for hotspot discovery is proposed. It addresses the problems that traditional KMeans clustering faces on short texts: sparse sentence embeddings, high feature dimensionality, ineffective similarity measures, and an unknown value of k. First, Elasticsearch performs an initial statistical clustering stage that supplies the k value for the KMeans clustering algorithm, and the unsupervised SimCSE method is introduced for the deep clustering stage. Second, the embedding model is fine-tuned together with the SSKU algorithm, with data augmentation, to mitigate the failure of the similarity metric and improve the model's embeddings. Finally, the embeddings are refined with UMAP manifold dimensionality reduction to counter the feature sparsity and high dimensionality of short-text embeddings. The experimental results show that the proposed multi-stage clustering method performs well on different datasets.

(2) A topic-fused summarization method for hot events is proposed. Traditional extractive summarization methods do not mine and exploit text features sufficiently, so they tend to extract sentences that are unrelated to the topic of the text, which yields summaries with poor relevance. To address this, the thesis uses the BTM topic model for topic extraction, encodes sentences and topics with the SBERT model, and uses an attention mechanism to compute the relevance between sentences and topics; the fused topic embedding features are then used to build the summary judgment model. The experimental results show that computing the correlation between topics and sentences strengthens the model's summary judgment ability, and that the extracted sentences contain more topic information, which improves performance.

(3) A web event hotspot discovery system is designed and implemented. The system is built on the multi-stage clustering framework and the event summarization model, and the work covers requirements analysis, system design, and system implementation. System testing verifies the stability and reliability of the system, and it has been integrated into an actual project as a subsystem that provides topic discovery and hotspot discovery functions.

The multi-stage clustering method and the event summarization method proposed in this thesis both achieve good results on public datasets. Pipelining the two methods achieves hotspot discovery of web events on massive unlabeled data, and the approach can be applied to scenarios such as intelligent analysis of public opinion and recommendation of scientific and technical literature.
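The two-stage idea behind the clustering method (a coarse first pass that estimates k, followed by k-means with that k) can be illustrated with a small sketch. This is not the thesis's implementation: the greedy Jaccard grouping stands in for the Elasticsearch statistical clustering stage, normalized bag-of-words vectors stand in for SimCSE embeddings, and the SSKU fine-tuning and UMAP steps are omitted. All function names, the threshold value, and the sample headlines are hypothetical.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenization (a simplification)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard overlap between two token lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def estimate_seeds(docs, threshold=0.3):
    """Stage 1 stand-in: greedy lexical grouping.

    A document opens a new group when it overlaps with no existing
    group seed. The number of seeds is the estimate of k.
    """
    seeds = []
    for i, doc in enumerate(docs):
        if not any(jaccard(tokens(doc), tokens(docs[s])) >= threshold for s in seeds):
            seeds.append(i)
    return seeds


def embed(doc, vocab):
    """Stage 2 stand-in for sentence embeddings: L2-normalized word counts."""
    counts = Counter(tokens(doc))
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def kmeans(vectors, centroids, max_iter=20):
    """K-means with dot-product similarity, started from the given centroids."""
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda c: dot(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def discover_hotspots(docs, threshold=0.3):
    """Run both stages; return (estimated k, cluster label per document)."""
    seeds = estimate_seeds(docs, threshold)
    vocab = sorted({w for d in docs for w in tokens(d)})
    vectors = [embed(d, vocab) for d in docs]
    centroids = [list(vectors[s]) for s in seeds]
    return len(seeds), kmeans(vectors, centroids)


# Hypothetical sample headlines, for illustration only.
docs = [
    "earthquake hits coastal city",
    "strong earthquake hits city",
    "team wins football final",
    "football team wins cup final",
    "earthquake damages coastal city",
]
k, labels = discover_hotspots(docs)
print(k, labels)
```

Seeding k-means with the first document of each coarse group keeps the sketch deterministic. In a fuller pipeline the stand-ins would be replaced by real sentence embeddings and a dimensionality reduction step before clustering.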
Keywords/Search Tags: Text Clustering, Deep Clustering, Text Summarization, Extractive Summarization, Hotspot Discovery