
Research On Entity Extraction Technology For Network Security

Posted on: 2022-08-16   Degree: Master   Type: Thesis
Country: China   Candidate: S H Chen   Full Text: PDF
GTID: 2518306569972719   Subject: Communication and Information System
Abstract/Summary:
With the development of Internet technology, the cyberspace security situation is becoming increasingly severe, seriously affecting national security and social stability. A knowledge graph for the network security domain can mine valuable security information from massive, fragmented network security data and help decision-makers analyze network security events. Entity extraction is an important part of knowledge graph construction. However, Chinese data sets for entity extraction in the network security field are rare, existing ontology definitions are not comprehensive enough, and the quality of entity extraction still needs to be improved. This thesis therefore studies entity extraction for knowledge graph construction in the network security domain.
First, an ontology model for the network security domain is designed. With reference to Structured Threat Information Expression (STIX 2.0) and the Unified Cybersecurity Ontology (UCO 1.0), and taking into account the entity categories that occur in real network security text, we construct a network security domain ontology that includes attack pattern, defense measure, malware, software, attacker, vulnerability and other categories.
Second, a large amount of network security text is collected and turned into data sets, to address the scarcity of data sets for entity extraction in this field. We built a crawler that automatically crawls and cleans network security text, and we also built a network security dictionary. We then manually annotated the data according to the network security ontology model, producing an annotated training corpus of about 10,000 entries.
Third, the Security BERT-SWLE-BiLSTM-Att-CRF entity extraction model is proposed. The model is first pre-trained on network security text to obtain the Security BERT model. A security word-level enhancement (SWLE) method is then proposed, which integrates word-level information to improve entity boundary determination and entity recognition. Next, a BiLSTM model and a self-attention mechanism are introduced to capture contextual semantics and emphasize key local information. Finally, a CRF model performs sequence labeling to extract the network security entities.
Fourth, to address the problems of classifying over many labels and of inaccurate entity recognition in single-task entity extraction, the thesis borrows the idea of multi-task learning and proposes a multi-task joint learning model built on the single-task model. The model performs entity segmentation and entity category judgment at the same time, and fuses the two sets of labels in the output layer. Experimental results show that, compared with the other traditional methods tested, the model achieves the best accuracy, recall and F1 value, and effectively extracts relevant entities from network security text.
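The label fusion step of the multi-task model can be illustrated with a minimal sketch. The abstract does not specify the tagging scheme or the fusion rule, so the following assumes a BIO boundary scheme for the segmentation task and one category tag per token for the category task; the function names `fuse_labels` and `extract_entities`, the sample tokens, and the category names are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple


def fuse_labels(boundary_tags: List[str], category_tags: List[str]) -> List[str]:
    """Fuse per-token boundary tags (B/I/O) with per-token category tags.

    Assumed rule (not from the thesis): a token is inside an entity only
    when both tasks agree that it is not "O".
    """
    if len(boundary_tags) != len(category_tags):
        raise ValueError("tag sequences must have the same length")
    fused = []
    for boundary, category in zip(boundary_tags, category_tags):
        if boundary == "O" or category == "O":
            fused.append("O")
        else:
            fused.append(f"{boundary}-{category}")
    return fused


def extract_entities(tokens: List[str], fused: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, category) pairs from fused BIO labels."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type = ""
    for token, tag in zip(tokens, fused):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the open entity with a matching category.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and skip this token.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], ""
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Illustrative input only; in the described model these two tag sequences
# would come from the segmentation head and the category head.
tokens = ["Cobalt", "Strike", "targets", "Windows"]
boundary = ["B", "I", "O", "B"]
category = ["Software", "Software", "O", "Software"]

fused = fuse_labels(boundary, category)
print(fused)
print(extract_entities(tokens, fused))
```

Splitting the decision this way lets each head predict over a small label set (three boundary tags, one tag per category) rather than over every boundary-category combination, which is one plausible reading of how the joint model reduces the label classification burden.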
Keywords/Search Tags:Network security, Entity extraction, Security ontology, Word level enhancement, Multi-task learning