Research On Few-shot Named Entity Recognition Technology For Threat Intelligence

Posted on:2024-07-08

Degree:Master

Type:Thesis

Country:China

Candidate:W M Yang

Full Text:PDF

GTID:2568307079471924

Subject:Electronic information

Abstract/Summary:

Cybersecurity attacks are proliferating, and organizations such as governments and enterprises are frequently threatened by security incidents. Unstructured threat intelligence text contains entity information about attack events, and the extracted entities can be used to strengthen response measures and shorten defense response time. In recent years, deep learning models have achieved good results on named entity recognition in the threat intelligence domain. However, these models rely on large amounts of annotated data, which is costly to obtain in this domain, and with only a small amount of annotated data they struggle to achieve satisfactory results. To address these problems, this thesis improves few-shot named entity recognition for threat intelligence from two directions, model architecture and multi-view learning, and proposes the following solutions.

(1) Because annotated data in the threat intelligence domain is scarce, a single few-shot learning model has difficulty fully learning the data features. This thesis therefore proposes a new Few-shot Threat Intelligence Named Entity Recognition Model (FTM). FTM co-trains a prototypical network, a pre-trained model, and a self-training model with the Tri-training algorithm, exploiting the complementary nature of the three model views to capture more threat intelligence domain knowledge at the encoding level. In experiments on the threat intelligence dataset, the model reaches a test F1 score of 44.56% in the 5-way 10-shot training scenario, at least 8.69% better than any of its single internal models, and FTM outperforms the other single and joint models in all three few-shot scenarios.

(2) The multi-view learning process of FTM produces redundant encoding information, which distorts the weight of features in the loss and leads to training errors. To address this, this thesis simplifies the structure of the Gated Recurrent Unit (GRU). Based on the improved GRU structure, FTM is changed to a three-view fusion approach, yielding the Few-shot Threat Intelligence Named Entity Recognition Model Based on Improved GRU Fusion (FTM-GRU). The GRU gating units and view correlation calculations determine which threat intelligence features are remembered and which are forgotten, highlighting important semantic features. In experiments on two threat intelligence datasets, FTM-GRU improves the F1 score in the 5-way 10-shot training scenario by 4.56% and 4.29%, respectively, relative to FTM, and it also outperforms FTM in the other few-shot scenarios.

(3) Given the large volume of unstructured threat intelligence data, a tool that can extract threat intelligence named entities easily and quickly is lacking. To fill this gap, a threat intelligence named entity recognition system is built. It integrates the few-shot threat intelligence named entity recognition model proposed in this thesis, supports batch operations on unstructured text, and effectively improves the efficiency of threat intelligence named entity recognition.
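
The abstract does not give implementation details for the Tri-training step in contribution (1), so the following is only a minimal sketch of the agreement rule at the heart of the general Tri-training algorithm: an unlabeled example is pseudo-labeled for one learner when the other two learners predict the same label for it. The function name, the learner ordering, and the tag names (MALWARE, TOOL, ORG, O) are hypothetical, and the error-rate checks of the full algorithm are omitted.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def tri_training_pseudo_labels(
    predictions: Sequence[Sequence[Hashable]],
) -> Dict[int, List[Tuple[int, Hashable]]]:
    """Return, for each of three learners, the (index, label) pairs on which
    the OTHER two learners agree. These pairs are the candidate pseudo-labels
    that the target learner would be retrained on.

    predictions[k][i] is learner k's predicted label for unlabeled example i.
    """
    if len(predictions) != 3:
        raise ValueError("Tri-training uses exactly three learners")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all learners must predict on the same examples")

    selected: Dict[int, List[Tuple[int, Hashable]]] = {0: [], 1: [], 2: []}
    for target in range(3):
        # The two learners that "teach" the target learner.
        a, b = [k for k in range(3) if k != target]
        for i in range(n):
            if predictions[a][i] == predictions[b][i]:
                selected[target].append((i, predictions[a][i]))
    return selected


# Hypothetical token-level predictions from three views
# (order assumed here: prototypical network, pre-trained model, self-training model).
proto_preds = ["MALWARE", "O", "TOOL", "O"]
plm_preds = ["MALWARE", "O", "O", "ORG"]
self_preds = ["MALWARE", "TOOL", "O", "O"]

pseudo = tri_training_pseudo_labels([proto_preds, plm_preds, self_preds])
for learner, pairs in pseudo.items():
    print(learner, pairs)
```

In this sketch each learner receives only the examples its two peers agree on, which is one way the complementary views can supply extra supervision when annotated data is scarce.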

Keywords/Search Tags:

threat intelligence, named entity recognition, few-shot learning, multi-view learning, named entity recognition system

Related items

1	Research And Implementation Of Chinese Named Entity Recognition Based On Deep Learning
2	Research On Unstructured Threat Intelligence Entity Extraction Method Based On Machine Learning
3	Chinese Named Entity Recognition Based On Neural Network
4	Research On Complex Entity Recognition And Class Increment Problem In Named Entity Recognition
5	Research On Algorithm And System Implementation On Named Entity Recognition For Chinese Electronic Medical Records
6	Candidate Region Aware Nested Named Entity Recognition
7	Named Entity Recognition With Multi-Grained Representation Learning
8	Research And Implementation Of Named Entity Recognition Based On Deep Learning
9	Research On Named Entity Recognition Algorithm And Its Implement In Specific Fields
10	Research On Named Entity Recognition Based On Deep Learning