
Research On Key Technologies For Construction And Application Of Threat Intelligence Knowledge Graph

Posted on: 2021-04-27 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: T Li | Full Text: PDF
GTID: 1368330647957273 | Subject: Computer Science and Technology
Abstract/Summary:
Cyber threat intelligence, a form of big data in cybersecurity, is essentially the summary and analysis of potential threat information from the attackers' perspective, which defenders use to improve their cybersecurity controls. As the cybersecurity situation grows increasingly severe, the value of cyber threat intelligence for cybersecurity protection becomes more and more prominent. As a typical example of the successful application of knowledge engineering in the era of big data, the knowledge graph is an important branch of artificial intelligence. A knowledge graph is essentially a large-scale semantic network that intuitively describes concepts, entities, attributes and semantic relations in the objective world in the form of a graph structure. It has two notable characteristics: 1) it associates and fuses multi-source heterogeneous data; 2) it supports accurate semantic retrieval and intelligent knowledge reasoning. These characteristics match the technical needs of big data mining and analysis well, and the knowledge graph has by now developed into a set of technologies for big data processing and mining.

This dissertation applies the knowledge graph to the field of cyber threat intelligence. Working with unstructured cyber threat intelligence data, we focus on the key technologies involved in the construction and application of a threat intelligence knowledge graph. The discussion emphasizes methods for knowledge extraction from textual threat intelligence and for knowledge reasoning over the threat intelligence knowledge graph, and the technical systems applied to large-scale cyber threat intelligence are reviewed systematically. The main contributions of this dissertation are as follows:

1. A multi-feature fusion-based method for entity extraction from threat intelligence is proposed. In automatically extracting threat-related knowledge from unstructured cyber threat intelligence, entity extraction is a basic task that aims to recognize specific entity categories, including software, malware, vulnerabilities, attack tools, attack patterns, etc. Current end-to-end entity extraction systems based on neural networks cannot accurately label the specific entity category and its boundary when applied to cyber threat intelligence. To solve this problem, we fuse multiple features of entity words, such as word features, character features, entity boundary features and context features, and model the problem as a sequence labeling task. We then design an encoder-decoder framework based on a deep learning model and an attention mechanism, which recognizes entities in cyber threat intelligence more accurately and also improves the training speed of the model.

2. An entity relationship extraction method for threat intelligence based on enhanced semantic features is proposed. To establish semantic associations among threat intelligence entities, the semantic relationships between them are extracted from unstructured cyber threat intelligence, a task that is cast as relation classification. Considering the limitations of end-to-end entity relation extraction systems in obtaining semantic relation information between entities, an adversarial learning mechanism is introduced to enhance the semantic features that describe entity relations. On this basis, the encoded semantic relationship information between entities is fed to a multi-class classifier for training. Finally, a supervised entity relationship extraction system for sentence-level cyber threat intelligence is obtained.

3. A knowledge triple extraction method for textual threat intelligence combined with adversarial active learning is proposed. The sentence-level semantic relation extraction system for threat intelligence entities has problems with practical efficiency and with acquiring overlapping relations. To address them, a joint extraction scheme for threat intelligence entities and relations is designed. We introduce a new labeling strategy that models the joint extraction of entities and relationships as a sequence labeling task. An encoder-decoder framework based on deep learning is then proposed, in which a dynamic attention mechanism is introduced to better capture the dependencies among words in the sequence. This method can be applied to directly obtain entity-relation semantic triples from paragraph-level threat intelligence. In addition, in view of the lack of labeled data for the joint extraction method, an adversarial active learning algorithm is proposed. The training samples to be labeled are selected by comparing the semantic similarity of the data, so that the performance of the model is improved at a lower labeling cost.

4. A hybrid knowledge reasoning method based on reinforcement learning and a graph convolutional network is proposed. Implicit knowledge cannot be obtained directly by semantic retrieval in the threat intelligence knowledge graph. To address this, a novel knowledge reasoning approach is designed to obtain the implicit relations among threat intelligence entities and to realize relational reasoning. Specifically, drawing on current practice in knowledge reasoning based on reinforcement learning and graph convolutional networks over general-domain knowledge graphs, an adversarial learning framework is designed that realizes knowledge reasoning by integrating reinforcement learning with a graph convolutional network.
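
As a rough illustration of what "modelling entity extraction as a sequence labeling task" means in contribution 1, the sketch below decodes a per-token tag sequence into typed entity spans. It assumes the common BIO tagging scheme; the abstract does not state which scheme the dissertation uses, and the function name `decode_bio`, the tag labels and the example sentence are all hypothetical. The neural tagger that would produce the tags is not shown.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO tags (assumed scheme)."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes any open entity;
            # the current token itself is not kept.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


# Hypothetical tagger output for one threat intelligence sentence.
tokens = ["WannaCry", "exploits", "CVE-2017-0144", "in", "Windows", "SMB"]
tags = ["B-MALWARE", "O", "B-VULNERABILITY", "O", "B-SOFTWARE", "I-SOFTWARE"]
entities = decode_bio(tokens, tags)
```

The B- and I- prefixes are what let a tagger mark both the category and the boundary of an entity, the two things the abstract says existing end-to-end systems label inaccurately. The joint extraction in contribution 3 would need a richer tag set that also encodes the relation and the entity's role in it, which this sketch does not attempt.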
Keywords/Search Tags:Cyber Threat Intelligence, Knowledge Graph, Knowledge Extraction, Entity Recognition, Relation Extraction, Knowledge Triple Extraction, Knowledge Reasoning