Research On Entity Recognition Technology For Knowledge Base Construction In Requirement Engineering Domain

Posted on: 2022-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: M D Xu
Full Text: PDF
GTID: 2518306575962309
Subject: Computer software and theory
Abstract/Summary:
With the explosive growth of network information, the ways of accessing and expressing knowledge have diversified. However, knowledge is still mostly expressed as text, so obtaining the desired knowledge accurately and efficiently has become a hot topic in recent years. Among the various ways of representing textual knowledge, the knowledge graph is a relatively scientific and well-developed method. Its construction consists of multiple steps, including entity recognition, relationship recognition, entity disambiguation, entity linking, and visualization. Named entity recognition (NER) is the first of these steps and therefore the point from which errors propagate to the later ones, so the accuracy of entity recognition is particularly critical for constructing the knowledge graph. In requirements documents, entities differ considerably from entities in the general sense in both content and length, which makes them harder to identify with conventional methods. This paper conducts an in-depth study of entity recognition models for requirements documents and introduces a deep learning approach to named entity recognition that first segments the text and then labels it. For text segmentation, based on observation of the data, we propose a method that segments entities in reverse by first identifying non-entities, followed by fine-grained optimization, and we combine it with a deep residual network (ResNet) to design a more accurate word segmentation model. For sequence labeling, we experiment with a bidirectional long short-term memory network with an attention mechanism (BiLSTM with attention) and a conditional random field (CRF), and we introduce a modified self-attention mechanism to build a hybrid named entity recognition model. Finally, to address the difficulty of effectively annotating the small amount of data available in requirements documents, we introduce an active learning method, which significantly reduces the data requirement without sacrificing accuracy or recall. We also verify the fault tolerance rate of the active learning model through experiments. Experiments show that the proposed methods achieve better recognition performance than traditional methods.
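As an illustration of the active learning step, the sketch below ranks unlabeled sentences by model uncertainty so that the least certain ones are annotated first. The abstract does not say which query strategy the thesis uses; least-confidence sampling is an assumption here, and the names `least_confidence`, `select_for_annotation`, and `predict_probs` are hypothetical.

```python
from typing import Callable, List, Sequence


def least_confidence(token_probs: Sequence[Sequence[float]]) -> float:
    """Sentence uncertainty: 1 minus the mean top-label probability over tokens.

    Assumption: the tagger exposes a per-token probability distribution
    over labels. The thesis abstract does not specify this interface.
    """
    if not token_probs:
        return 0.0
    mean_top = sum(max(p) for p in token_probs) / len(token_probs)
    return 1.0 - mean_top


def select_for_annotation(
    pool: Sequence[str],
    predict_probs: Callable[[str], Sequence[Sequence[float]]],
    batch_size: int,
) -> List[int]:
    """Return indices of the `batch_size` most uncertain sentences in `pool`."""
    scored = [(least_confidence(predict_probs(s)), i) for i, s in enumerate(pool)]
    # Highest uncertainty first; ties are broken by original position.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:batch_size]]


# Toy usage with hard-coded probabilities standing in for a trained tagger.
toy_probs = {
    "s0": [[0.9, 0.1], [0.8, 0.2]],
    "s1": [[0.5, 0.5], [0.6, 0.4]],
    "s2": [[0.99, 0.01]],
}
chosen = select_for_annotation(list(toy_probs), toy_probs.__getitem__, 2)
```

In a full loop, the selected sentences would be annotated, added to the training set, and the model retrained before the next selection round.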
Keywords/Search Tags: Named entity recognition, Deep learning, Word segmentation, Grammatical regulation, Active learning