
Research On Entity Relationship Extraction Technology Based On Deep Learning Fusion Of Text Features

Posted on: 2022-05-22 | Degree: Master | Type: Thesis
Country: China | Candidate: C F Ran
GTID: 2518306575466534 | Subject: Computer technology
Abstract/Summary:
With the explosive growth of data, a large amount of knowledge has accumulated on the Internet. To make this knowledge easier to manage, the concept of the knowledge graph was proposed. However, because industries differ, there is no uniform way to construct a knowledge graph. Since the pneumonia outbreak, people have paid more attention to health, and diabetes is one of the important complications that affect health. Taking text data in the diabetes domain as its object, this thesis therefore studies an entity recognition model and a relation extraction model based on deep learning fused with text features, with the aim of extracting structured information from unstructured text, and builds a knowledge extraction system for the diabetes domain. The main research contents of this thesis are as follows:

1. To address the differences in text features across domains, this thesis designs a multi-feature entity recognition model. First, in the word embedding layer, a multi-feature word embedding algorithm is proposed that integrates the pinyin, the radical and the meaning of each character, so that the embedding vector reflects both the characteristics of Chinese characters and those of diabetes text. Then, in the modeling stage, CNN and BiLSTM are used to extract the local features and the global context features of the text sequence respectively, which addresses the inability of traditional methods to capture dependencies across the text sequence. Finally, CRF is used to output the predicted tag sequence. The experimental results show that the multi-feature embedding algorithm and the local features extracted by CNN effectively improve the performance of the entity recognition model.

2. To exploit text features at the article level, this thesis designs a relation extraction model that integrates text features. First, in corpus processing, considering that related entity pairs may span sentences, a new sample construction method combining text features is proposed. Second, in feature representation, the character meaning feature, pinyin feature, radical feature, entity type feature and the local features extracted by CNN are introduced. Then, a two-layer BiLSTM with residual connections is used to fully extract the global context features of the text sequence, combined with an attention mechanism to sharpen the model's focus on key words. Finally, Softmax is used to output the relation prediction. The experimental results show that the two-layer BiLSTM structure and the sample construction method combining text features effectively improve the performance of the relation extraction model.

3. Based on the entity recognition model and the relation extraction model proposed in this thesis, a knowledge extraction system for the diabetes domain is designed and implemented. The system is divided into three parts: data acquisition, a data middle platform, and knowledge extraction. Together they support knowledge extraction and storage as well as visual display of knowledge.
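
The multi-feature word embedding described in the first research item can be illustrated with a minimal sketch. The abstract does not say how the character, pinyin and radical features are fused, so plain concatenation is an assumption here. The lookup dictionaries `PINYIN` and `RADICAL`, the `EmbeddingTable` class, the `embed_text` function and the vector dimensions are all hypothetical names and values chosen for illustration; random vectors stand in for trained embeddings.

```python
import random

# Hypothetical lookup tables (assumption): a real system would derive these
# from a pinyin dictionary and a radical dictionary covering the whole corpus.
PINYIN = {"糖": "tang", "尿": "niao", "病": "bing"}
RADICAL = {"糖": "米", "尿": "尸", "病": "疒"}


class EmbeddingTable:
    """Maps each distinct key to a fixed random vector (stand-in for a trained table)."""

    def __init__(self, dim, seed):
        self.dim = dim
        self._rng = random.Random(seed)
        self._vectors = {}

    def lookup(self, key):
        # Create the vector on first use, then always return the same one.
        if key not in self._vectors:
            self._vectors[key] = [self._rng.uniform(-1.0, 1.0) for _ in range(self.dim)]
        return self._vectors[key]


def embed_text(text, char_table, pinyin_table, radical_table):
    """Return one fused vector per character: [character ; pinyin ; radical].

    Fusion by concatenation is an assumption, not the thesis's stated method.
    """
    fused = []
    for ch in text:
        vec = (
            char_table.lookup(ch)
            + pinyin_table.lookup(PINYIN.get(ch, "<unk>"))
            + radical_table.lookup(RADICAL.get(ch, "<unk>"))
        )
        fused.append(vec)
    return fused


# Illustrative dimensions only.
char_table = EmbeddingTable(dim=8, seed=0)
pinyin_table = EmbeddingTable(dim=4, seed=1)
radical_table = EmbeddingTable(dim=4, seed=2)

vectors = embed_text("糖尿病", char_table, pinyin_table, radical_table)
print(len(vectors), len(vectors[0]))  # one 16-dimensional vector per character
```

In a pipeline like the one the abstract outlines, a sequence of fused vectors of this kind would be the input to the CNN and BiLSTM layers; characters missing from the dictionaries fall back to a shared `<unk>` entry.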
Keywords/Search Tags:deep learning, text features, knowledge graph, entity recognition, relationship extraction