
Topic Detection And Tracking For Network Information Security

Posted on: 2021-02-16
Degree: Master
Type: Thesis
Country: China
Candidate: L Lin
GTID: 2518306308468924
Subject: Electronics and Communications Engineering
Abstract/Summary:
Network information security incidents are increasing, and people are paying more and more attention to information security content on the network. There is therefore a need for a technology that can organize and analyze large volumes of network content, extract the information security material accurately, and deliver it to users. Topic detection and tracking (TDT) addresses this need: it detects new topics in information security texts within the network information flow and continuously tracks reports on known security topics, so that readers can follow an information security event as a whole and receive internet information that is more useful and more accurate.

First, this thesis describes the technologies underlying TDT, including text segmentation, text representation models, text feature extraction, vector weight calculation, similarity calculation, and text clustering. It then analyzes how information security texts differ from traditional TDT corpora. In view of these characteristics and of the shortcomings of existing algorithms, the filtering methods and keyword algorithms used in the text feature processing stage are improved, which raises the accuracy of subsequent text processing.

The traditional topic detection algorithm based on hierarchical clustering is prone to clustering effects that degrade detection performance. A new topic detection algorithm built on hierarchical clustering is therefore proposed and implemented; it combines weighted vector calculation with a two-step clustering method. Experiments show that the algorithm increases the accuracy of topic detection and reduces the false detection rate.

Traditional topic tracking algorithms are prone to topic drift and to errors in similarity calculation. A new topic tracking algorithm built on traditional topic tracking is therefore proposed and implemented; it combines dynamic updating of the topic vector model with a reconciled average similarity calculation. Experiments show that the algorithm improves the accuracy of topic tracking and reduces the false detection rate.

Building on topic detection and topic tracking, this thesis also implements ranking and importance analysis of hot topics. The related algorithm outputs representative texts along a timeline, which helps readers understand and analyze popular topics.
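As a rough illustration of the tracking idea summarized above, the following Python sketch represents documents as TF-IDF vectors, compares each incoming document with a topic vector by cosine similarity, and moves the topic vector toward every document it accepts so that the model can follow an evolving story. It is a minimal sketch under stated assumptions, not the algorithm of the thesis: the names `tf_idf_vectors`, `cosine`, and `TopicTracker`, the threshold `0.3`, the update rate `alpha`, and the toy documents are all hypothetical, and plain cosine similarity stands in for the thesis's own similarity calculation, which the abstract does not specify.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (dict: term -> weight)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF so that no weight collapses to zero.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TopicTracker:
    """Tracks one topic; threshold and alpha are illustrative assumptions."""

    def __init__(self, seed_vector, threshold=0.3, alpha=0.2):
        self.topic = dict(seed_vector)
        self.threshold = threshold
        self.alpha = alpha

    def track(self, vector):
        """Return True if the document is on topic, updating the topic vector."""
        if cosine(self.topic, vector) < self.threshold:
            return False
        # Dynamic update: blend the accepted document into the topic vector.
        terms = set(self.topic) | set(vector)
        self.topic = {
            t: (1 - self.alpha) * self.topic.get(t, 0.0) + self.alpha * vector.get(t, 0.0)
            for t in terms
        }
        return True


# Toy corpus (hypothetical): two reports on one incident and one unrelated story.
docs = [
    ["ransomware", "attack", "hospital", "network"],
    ["ransomware", "attack", "encrypts", "hospital", "files"],
    ["football", "match", "final", "score"],
]
seed, follow_up, unrelated = tf_idf_vectors(docs)
tracker = TopicTracker(seed)
print(tracker.track(follow_up), tracker.track(unrelated))
```

A small `alpha` keeps the topic vector close to its seed, while a larger one follows new reports more quickly; larger values also make the model easier to pull off topic, which is the drift problem the abstract mentions.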
Keywords/Search Tags: TDT, information security, keyword extraction, similarity calculation, text clustering