Research On Key Technologies Of Entity Linking Toward Electromagnetic Space Domain

Posted on: 2024-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: H B Cai
Full Text: PDF
GTID: 2530307052995859
Subject: Electronic information
Abstract/Summary:
In the electromagnetic space domain, updating the parameter information of electronic equipment and tracking changes to its carrying platform rely on information provided by various departments. With the development of Internet technology, intelligence is updated faster and faster, and manually summarizing and sorting the entities in intelligence can no longer meet the real-time requirements of modern military confrontation. Although information extraction techniques have been proposed for this purpose, different departments express the same entity in inconsistent ways, which makes the information difficult to manage and apply.

Entity linking maps each entity mention in a text to a unique entity in the domain knowledge graph, and it is one of the core tasks in updating that graph. Entity linking can also map the electronic equipment or carrying platforms mentioned in intelligence to entity nodes in the domain knowledge graph, so that experts can use the relationships between nodes to help analyze and judge unknown electromagnetic signals on the battlefield. This is of great significance to national defense and military work. To address these problems and challenges, this thesis studies entity linking technology for the electromagnetic space domain. The main contributions are as follows:

(1) A knowledge graph of the electromagnetic space domain is constructed. Using web crawling and web page parsing, triples of relevant domain entities are extracted from a local confidential intelligence data source and from the open-domain English Wikipedia knowledge base, and the ontology of the domain knowledge graph is defined in a bottom-up manner. The knowledge from the two sources is then combined to build the final domain knowledge graph.

(2) An entity linking dataset for the electromagnetic space domain is constructed from open-source intelligence on the Internet. Posts by Weibo users in related fields are crawled and preprocessed, and the domain entity mentions in the text are then annotated with the corresponding entities in the knowledge graph.

(3) To remove the dependence of entity linking on prior features such as disambiguation tables and entity popularity, and to make the model pay more attention to the features of entities in the knowledge graph, this thesis proposes the entity linking model ELMTCL (Entity Linking Model based on Type-supervised Contrastive Learning). ELMTCL is built on the pre-trained language model BERT. It uses BERT to encode entity mentions and knowledge graph entities into vectors separately, and uses contrastive learning to increase the similarity between each mention and its correct entity. To improve its semantic encoding ability, ELMTCL adds the fine-grained type of each knowledge graph entity to the contrastive loss function as an additional label. ELMTCL also uses global hard negative sampling, which lets entities that do not appear in the dataset take part in training and improves the generalization ability of the model. Finally, ELMTCL is compared on the domain dataset with traditional methods and with state-of-the-art open-domain entity linking models, and the results demonstrate its effectiveness.

(4) To address model deployment on edge devices, this thesis proposes a lightweight entity linking model, Tiny-ELMTCL, and uses both result-oriented knowledge distillation and intermediate-feature-oriented knowledge distillation to transfer the knowledge in ELMTCL to Tiny-ELMTCL. Tiny-ELMTCL runs inference 7.15 times as fast as the original model, has only 11% of the original model's parameters, and retains 96% of its linking accuracy. The distillation results of lightweight models with different structures are also compared, which demonstrates the effectiveness of the Tiny-ELMTCL structure.
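
The type-supervised contrastive objective described in contribution (3) can be illustrated with a minimal sketch. The abstract does not give the loss formula, so everything below is an assumption made for illustration: the function name, the in-batch InfoNCE form of the linking term, the supervised-contrastive form of the type term, and the `temperature` and `type_weight` parameters are all hypothetical. The sketch works on plain Python lists that stand in for BERT encodings, and it does not include global hard negative sampling.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that dot() gives cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_z for s in scores]


def type_supervised_contrastive_loss(mentions, entities, types,
                                     temperature=0.1, type_weight=0.5):
    """Assumed loss form, not the thesis's actual formula.

    mentions[i] is the vector of mention i, entities[i] the vector of its
    gold entity, and types[i] the fine-grained type of entities[i].
    """
    m = [normalize(v) for v in mentions]
    e = [normalize(v) for v in entities]
    n = len(m)
    link_loss = 0.0
    type_loss = 0.0
    for i in range(n):
        # Similarity of mention i to every entity in the batch.
        logp = log_softmax([dot(m[i], e[j]) / temperature for j in range(n)])
        # Linking term: the gold entity is the only positive.
        link_loss -= logp[i]
        # Type term: every entity sharing the gold entity's type is a positive.
        positives = [j for j in range(n) if types[j] == types[i]]
        type_loss -= sum(logp[j] for j in positives) / len(positives)
    return (link_loss + type_weight * type_loss) / n
```

In this sketch the linking term pulls each mention toward its gold entity, while the type term additionally pulls it toward other entities of the same fine-grained type, which is one plausible reading of using type information "as another label" in the loss.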
Keywords/Search Tags: Entity Linking, Electromagnetic Space, Supervised Contrastive Learning, Global Hard Negative Sampling, Knowledge Distillation