Research On Network Security Threat Intelligence Mining Key Technologies For Open Source Data

Posted on:2024-04-24

Degree:Master

Type:Thesis

Country:China

Candidate:S H Cheng

GTID:2568307127954699

Subject:Computer technology

Abstract/Summary:

Network security threat intelligence is comprehensive, accurate, and structured security information that provides early warning and decision support through information sharing, and it is the basis and premise of security defense work. At present, threat intelligence is widely scattered across open source unstructured data such as APT reports and security blogs. Important information in this data, such as threat entities and the relationships between them, is difficult to use directly, and much of the content is fake. This thesis therefore focuses on mining network security threat intelligence from open source data. The main work is summarized as follows:

(1) To address the problems that unstructured information in threat intelligence text is difficult to exploit fully and that training samples are scarce, a few-shot Threat Intelligence Entity Relation Extraction (TIERE) method is proposed. First, given the high complexity and specialized nature of open source network security analysis reports, a data preprocessing method is proposed to make the text easier to analyze. Then, a Named Entity Recognition based on Improved Bootstrapping (NER-IBS) algorithm and a Relationship Extraction based on Semantic Role Labeling (RE-SRL) algorithm are proposed. These algorithms construct the initial seeds from a small number of samples and rules, mine threat entities over multiple iterations, and extract the relationships between entities by constructing semantic roles. Finally, the extracted entities and relationships are standardized according to the STIX2 format specification. Experimental results show that the TIERE method can effectively mine threat intelligence entities and their relationships in a few-shot setting.

(2) To address the fuzzy classification, similar contexts, and uneven distribution of network security threat intelligence entities, a Threat Intelligence Entity Identification (TIEI) method is proposed. First, the threat intelligence text is preprocessed so that long text containing redundant information is converted into a streamlined word sequence. Then, a Machine Reading Comprehension with Specialized Knowledge (MRC-SK) model is proposed, which uses an attention mechanism to learn additional professional knowledge and uses a pointer network to predict the start and end indices, in the original text, of the answer to each question. Finally, to alleviate the impact of the uneven entity distribution on recognition results, the Dice loss function is used in place of the cross-entropy loss function during model training. Experimental results show that the TIEI method can effectively mine threat intelligence entities with ambiguous classification and similar contexts.

(3) To effectively identify fake information in open source network security threat intelligence, a Fake Threat Intelligence Mining (FTIM) method is proposed that targets existing data poisoning attack methods. First, a Generation of Fake Threat Intelligence Based on GPT-2 (GFTI-GPT) model is proposed, which is pre-trained on general data, fine-tuned on network security threat intelligence text, and randomly generates fake information through Top-P sampling. Then, the data poisoning attack is simulated, the impact of fake information on real network security data sources is analyzed, and real and fake information are labeled to construct a sample dataset. Finally, combined with the BERT pre-trained model, a Classification of Fake Threat Intelligence Based on Text CNN (CFTI-TC) model is proposed, which can effectively identify fake network security threat intelligence. Experimental results show that the FTIM method can effectively identify fake information in open source data.
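
The abstract does not give the details of the NER-IBS algorithm, so the following is only a minimal sketch of generic pattern-based bootstrapping, the family of techniques that item (1) builds on. It shows how a small seed set can grow over iterations: contexts around known entities become patterns, and the patterns then pick out new candidates. The function name `bootstrap_entities`, the (previous word, next word) pattern shape, and the toy sentences with made-up entity names are all assumptions for illustration, not the author's method or data.

```python
def bootstrap_entities(sentences, seeds, iterations=3):
    """Grow a seed entity set using (previous word, next word) context patterns.

    Hypothetical illustration of plain bootstrapping; the thesis's improved
    variant (NER-IBS) is not described in the abstract and is not reproduced here.
    """
    entities = set(seeds)
    patterns = set()
    tokenized = [sentence.split() for sentence in sentences]

    for _ in range(iterations):
        # Step 1: learn context patterns around every known entity mention.
        for tokens in tokenized:
            for i in range(1, len(tokens) - 1):
                if tokens[i] in entities:
                    patterns.add((tokens[i - 1], tokens[i + 1]))

        # Step 2: apply the patterns to collect candidate entities.
        candidates = set()
        for tokens in tokenized:
            for i in range(1, len(tokens) - 1):
                if (tokens[i - 1], tokens[i + 1]) in patterns:
                    candidates.add(tokens[i])

        # Stop early once an iteration discovers nothing new.
        if candidates <= entities:
            break
        entities |= candidates

    return entities


# Toy corpus with invented names (not real threat data).
toy_sentences = [
    "the group FooBear used the malware BarRAT",
    "the group ZedPanda used the malware QuxLoader",
]
found = bootstrap_entities(toy_sentences, seeds={"FooBear"})
```

Here the single seed yields the pattern ("group", "used"), which in turn matches the second group name. A practical system would also score patterns and candidates to limit semantic drift, which this sketch omits.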
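
To illustrate why item (2) replaces cross-entropy with Dice loss when entity labels are unevenly distributed, the sketch below compares the two on a toy binary labeling problem. It assumes the common soft Dice formulation with a smoothing term; the exact variant used in the thesis is not stated in the abstract, and the function names and the `smooth` parameter are illustrative.

```python
import math


def cross_entropy_loss(probs, labels):
    """Mean binary cross-entropy over all positions."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(p) + sum(y) + smooth).

    Assumed formulation for illustration; not taken from the thesis.
    """
    overlap = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (denominator + smooth)


# Imbalanced toy example: 9 non-entity tokens, 1 entity token,
# and a model that predicts "non-entity" (probability 0.1) everywhere.
labels = [0] * 9 + [1]
probs = [0.1] * 10

ce = cross_entropy_loss(probs, labels)
dl = dice_loss(probs, labels)
```

Cross-entropy averages over every token, so the nine easy negatives dilute the penalty for missing the one entity. Dice loss is driven by the overlap between predictions and positive labels, so the missed entity dominates the result.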
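
The Top-P (nucleus) sampling step mentioned in item (3) can be sketched without any model. Given a next-token probability distribution, keep the smallest set of highest-probability tokens whose cumulative probability reaches `p`, renormalize, and sample from that set. The function names and the toy distribution are assumptions for illustration; the GFTI-GPT model itself is not reproduced here.

```python
import random


def top_p_filter(probs, p=0.9):
    """Return {token_index: renormalized_prob} for the nucleus of `probs`."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for index, prob in ranked:
        nucleus.append((index, prob))
        cumulative += prob
        if cumulative >= p:
            break  # smallest prefix whose mass reaches p
    total = sum(prob for _, prob in nucleus)
    return {index: prob / total for index, prob in nucleus}


def sample_top_p(probs, p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = top_p_filter(probs, p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy next-token distribution over a 4-token vocabulary (hypothetical values).
toy_probs = [0.5, 0.3, 0.15, 0.05]
kept = top_p_filter(toy_probs, p=0.75)
token = sample_top_p(toy_probs, p=0.75, rng=random.Random(0))
```

With `p=0.75`, only the two most likely tokens stay in the nucleus, so low-probability tokens are never sampled while some randomness among plausible tokens remains.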

Keywords/Search Tags:

Threat Intelligence, Entity Recognition, Relation Extraction, Data Poisoning, Fake Intelligence Mining

Related items

1	Research On Unstructured Threat Intelligence Entity Extraction Method Based On Machine Learning
2	Research On Parallel Mining Technologies Of Threat Intelligence For Internet Big Data
3	Research On Threat Semantic Recognition And Sharing Based On Multi-source Threat Intelligence
4	Research On Network Threat Intelligence Information Extraction Based On Deep Learning
5	Research And Application Of Threat Intelligence Knowledge Graph Construction Method For Unstructured Data
6	Research And Implementation Of Information Extraction Model For Cyber Threat Intelligence
7	Research On Extracting Threat Intelligence Information Based On Pre-trained Language Models
8	Research On Key Technologies For Construction And Application Of Threat Intelligence Knowledge Graph
9	Research On Key Technologies Of Knowledge Extraction For Chinese Threat Intelligence
10	Research On Key Technologies For Construction And Application Of Cyber Threat Intelligence Knowledge Graph