Research On Entity Recognition Method For The Construction Of Software Engineering Knowledge Graph

Posted on:2022-02-07

Degree:Master

Type:Thesis

Country:China

Candidate:Z K Xu

Full Text:PDF

GTID:2518306740994489

Subject:Cyberspace security

Abstract/Summary:

Named entity recognition aims to identify phrases in natural language text that refer to specific entities. It is a basic task in natural language processing and in the automated construction of knowledge graphs. Building a high-quality software engineering knowledge graph not only helps to accumulate and reuse valuable software engineering experience, but also effectively improves search and recommendation performance in intelligent software testing and development. Entity recognition is an important task in software engineering knowledge extraction. Pre-trained language models have been widely used for named entity recognition in general domains. However, software engineering researchers have rarely applied pre-trained language models to named entity recognition in the software engineering domain. As a result, existing software engineering named entity recognition methods lag behind general-domain entity recognition in recognition performance. In addition, because corpus resources are scarce and labeling is difficult in the software engineering domain, named entity recognition in this domain remains a typical low-resource named entity recognition problem.

To this end, this thesis studies entity recognition methods for software engineering. First, an entity recognition model based on a pre-trained neural network is applied to the software engineering entity recognition task. Then, several entity recognition methods are designed for the low-resource problem in software engineering named entity recognition. Finally, corresponding experiments are designed to verify the rationality and effectiveness of the proposed model and methods. Specifically, the main work of this thesis includes:

1) A software engineering named entity recognition model based on a pre-trained neural network is proposed. Pre-trained language models can be trained in a self-supervised manner on large-scale corpora, which lets them capture deep semantic features of text and produce high-quality word embedding representations. Therefore, this thesis designs a software engineering named entity recognition model based on the pre-trained language model BERT. First, the model obtains basic word embedding representations from BERT, making full use of the deep semantic information that the pre-trained language model provides. Then, a character-level convolutional neural network is used to enhance the representation of unseen words, and a bidirectional recurrent neural network is combined with it to learn the contextual features of the text and form the final sequence feature matrix. Finally, a conditional random field decodes the sequence feature matrix to obtain the corresponding output sequence. Experimental results show that the structure of the designed model is reasonable and that it performs very well on the software engineering named entity recognition task.

2) Methods for low-resource entity recognition in the software engineering domain are proposed. To address the low-resource problem in software engineering named entity recognition, this thesis designs corresponding low-resource entity recognition strategies. First, a domain knowledge base for software engineering is built by combining external knowledge, and this external knowledge is used to enhance the performance of the entity recognition model. Then, because a large number of entities in software engineering named entity recognition data sets are unlabeled, a conditional random field for incompletely labeled data is designed. Finally, several loss functions are proposed to handle the imbalanced distribution of entities in the data set, which improves software entity recognition when the data is imbalanced. Corresponding experiments on real software engineering text data sets are designed in the experimental evaluation chapter of this thesis. The experimental results show that the proposed methods for low-resource entity recognition help to improve entity recognition in the software engineering domain.
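
The final decoding step of the model described above, in which a conditional random field turns the sequence feature matrix into an output tag sequence, is commonly done with the Viterbi algorithm. The abstract does not say which decoding procedure the thesis uses, so the following is only a minimal sketch under that assumption. The tag names (`O`, `B-API`, `I-API`), the function name `viterbi_decode`, and all scores are hypothetical toy values, not taken from the thesis.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration only (not from the thesis).
TAGS = ["O", "B-API", "I-API"]


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at token position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # Best score of any path ending in each tag at the current position.
    scores = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_scores: List[float] = []
        pointers: List[int] = []
        for j in range(num_tags):
            # Best previous tag for a path that ends in tag j at position t.
            best_prev = max(
                range(num_tags), key=lambda i: scores[i] + transitions[i][j]
            )
            new_scores.append(
                scores[best_prev] + transitions[best_prev][j] + emissions[t][j]
            )
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag.
    path = [max(range(num_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy scores: the transition O -> I-API is heavily penalised, as it would be
# in a trained CRF, because an entity cannot start with an "inside" tag.
TOY_TRANSITIONS = [
    [0.0, 0.0, -100.0],  # from O
    [0.0, 0.0, 0.0],     # from B-API
    [0.0, 0.0, 0.0],     # from I-API
]
TOY_EMISSIONS = [
    [2.0, 0.0, 0.0],  # token 1: prefers O
    [0.0, 1.0, 2.0],  # token 2: per-token argmax would pick I-API
    [0.0, 0.0, 2.0],  # token 3: prefers I-API
]

best_path = viterbi_decode(TOY_EMISSIONS, TOY_TRANSITIONS)
print([TAGS[i] for i in best_path])  # ['O', 'B-API', 'I-API']
```

In this toy case, choosing the best tag for each token independently would give `O, I-API, I-API`, which is not a valid entity span. Decoding over the whole sequence with transition scores instead yields `O, B-API, I-API`, which illustrates why a CRF layer is often placed on top of the per-token features.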

Keywords/Search Tags:

Named entity recognition, Pre-trained language model, Low-resource entity recognition, Software engineering, Knowledge Graph

Related items

1	Domain Adaptation Research And Application Of Named Entity Recognition
2	Research On Method And Application Of Named And Terminology Entity Recognition
3	Research And Implementation Of Named Entity Recognition Based On Deep Learning
4	Recognition And Discovery Of Programing Design Network Resource Named Knowledge Entity
5	Research On Named Entity Recognition And Disambiguation Based On Network Semantic Resource
6	Chinese Entity Relation Extraction Based On BERT And Knowledge Verification
7	Software Entity Recognition Method Based On Deep Learning
8	Research On Chinese Named Entity Recognition Based On Deep Learning
9	Research On Bert-based Named Entity Recognition
10	Research On The Construction Of Financial Knowledge Graph Based On Deep Learning