As network attack technology continues to evolve, network attacks are becoming more complex and more advanced. Traditional network defense systems are unfamiliar with these new attack methods, which results in unsatisfactory defense performance. Obtaining detailed information about these new types of attack and identifying their common characteristics can provide key data support for defense decision-making. Threat intelligence describes network attacks from several perspectives, such as methods, techniques and tactics, and plays an important role in defending against new attacks. Using multi-source heterogeneous threat intelligence in defense decision-making remains difficult, and building a threat intelligence knowledge graph is one possible solution. However, given the large volume of multi-source heterogeneous threat intelligence data, recognizing threat intelligence entities and extracting the relationships between them are major challenges in constructing such a knowledge graph. This paper studies information extraction for threat intelligence using deep learning methods, focusing on entity recognition and relation extraction. The main contents and innovations of this paper are as follows:

(1) To address the uneven distribution of entity categories and the presence of nested entities in network threat intelligence, a threat intelligence entity recognition method based on word-pair relation classification is proposed. The method uses the pre-trained model BERT and a bidirectional long short-term memory network (BiLSTM) to obtain word representations for the sentence, and constructs a two-dimensional grid of word pairs. A multi-granularity dilated convolution is then proposed to refine the word-pair representations and to capture the interactions between both close and distant word pairs. Finally, a biaffine classifier and a multi-layer perceptron jointly predict the word-pair relations, which are decoded to generate candidate entities. The experimental results verify the effectiveness of the model.

(2) To address the long distances between related entities in network threat intelligence, as well as the error propagation and lack of interaction in pipeline relation extraction, a span-based joint extraction method for threat intelligence entities and relations is proposed. The method uses the pre-trained model BERT and a BiLSTM to vectorize the words, then detects all token spans in the sentence and classifies each span, and finally classifies relations by pairing the candidate entities. In addition, by adding context features and negative sampling, the model more fully exploits the dependencies between threat intelligence entities and relations, so that it can better recognize entities and classify the relations between them. The experimental results verify the effectiveness of the model.
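The span detection and negative sampling steps in (2) can be illustrated with a minimal sketch. The abstract does not state how spans are enumerated or how negative samples are drawn, so the sketch assumes a common span-based setup: every token span up to a maximum width is a candidate, and negative samples are drawn at random from candidates that are not gold entities. The function names, the `max_width` and `num_negatives` parameters, the example sentence and the entity labels are all hypothetical.

```python
import random


def enumerate_spans(tokens, max_width):
    """Return every (start, end) token span, end exclusive, of at most max_width tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def sample_negative_spans(tokens, gold_spans, max_width, num_negatives, seed=0):
    """Randomly pick candidate spans that are not gold entities (assumed sampling scheme)."""
    rng = random.Random(seed)
    gold = set(gold_spans)
    candidates = [s for s in enumerate_spans(tokens, max_width) if s not in gold]
    return rng.sample(candidates, min(num_negatives, len(candidates)))


# Hypothetical sentence and labels, for illustration only.
tokens = "ExampleGroup deployed FakeRAT malware against banks".split()
gold_spans = {(0, 1): "ThreatActor", (2, 4): "Malware"}

all_spans = enumerate_spans(tokens, max_width=3)
negatives = sample_negative_spans(tokens, gold_spans.keys(), max_width=3, num_negatives=5)

for start, end in negatives:
    print("negative span:", " ".join(tokens[start:end]))
```

In a full model, each gold and negative span would be turned into a vector (for example, by pooling its BERT and BiLSTM token representations) before classification; that part is omitted here.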