
Research On The Application Of Text Classification And Clustering In Network Security Operation System

Posted on: 2021-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Wang
Full Text: PDF
GTID: 2428330632963022
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of information technology, the network has become an indispensable part of daily life. More and more enterprises and organizations are paying attention to network security and hope to secure network access through network security operation management systems. Network security protection is no longer limited to passive defense mechanisms: customers also expect the security operation system to provide proactive, real-time protection that prevents problems before they occur and minimizes risk. In this thesis, data mining algorithms from machine learning, namely text classification and text clustering, are applied to the network security operation system. On the one hand, the system implements website classification, which supports in-depth analysis of users' network access logs. On the other hand, it discovers hot topics in the network, which helps the system summarize and analyze network security information in a timely manner and take active measures to improve the efficiency of daily management and network security protection work.

Website classification is an important application in the security operation system, and in essence it is the classification of text data on web pages. For this problem, classical text classification algorithms offer limited accuracy and scalability, while deep neural network algorithms rely heavily on very large training sets and powerful computing hardware. Building on the deep forest, this thesis improves the feature representation and information extraction of the original text with the help of a word embedding model and the n-gram idea, so that the improved deep forest is better suited to website text classification and outperforms other classification models in both running efficiency and accuracy.

Hot topic detection is also an important application in the security operation system, and in essence it is the clustering of network text. Because network text features are high-dimensional and sparse, clustering them directly with traditional text clustering algorithms requires a large amount of memory and computing resources. Moreover, because information redundancy inevitably exists in high-dimensional text features, the quality of direct clustering is also unsatisfactory. In this thesis, a competition mechanism is added to the deep autoencoder so that it can better extract features from text data and reduce their dimensionality. Clustering the text features after dimension reduction not only improves clustering efficiency but also greatly improves clustering quality.
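
The abstract does not say how the competition mechanism is implemented. One common form of competition among hidden units is k-winners-take-all, where only the strongest activations in a layer survive and the rest are set to zero. The sketch below illustrates that idea under this assumption; the function name `k_winners_take_all`, the parameter `k`, and the sample values are hypothetical and are not taken from the thesis.

```python
def k_winners_take_all(activations, k):
    """Keep the k largest-magnitude activations and zero out the rest.

    Assumed illustration of a competition step on one hidden layer;
    the thesis's actual mechanism may differ.
    """
    if k <= 0:
        return [0.0] * len(activations)
    if k >= len(activations):
        return list(activations)
    # Rank units by absolute activation; ties go to the lower index.
    ranked = sorted(range(len(activations)),
                    key=lambda i: (-abs(activations[i]), i))
    winners = set(ranked[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]


# Hypothetical hidden-layer activations for a single document.
hidden = [0.1, -0.9, 0.4, 0.05, 0.7]
sparse_code = k_winners_take_all(hidden, k=2)
print(sparse_code)
```

In a design of this kind, the sparse code produced by the bottleneck layer would serve as the low-dimensional feature vector passed to the clustering step.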
Keywords/Search Tags: text classification, text clustering, internet security, deep forest, autoencoder