Domain-Oriented Entity Recognition And Relationship Extraction Design And Implementation

Posted on: 2020-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Gao
Full Text: PDF
GTID: 2428330596475459
Subject: Software engineering
Abstract/Summary:
With the continuous development of the Internet and the rapid growth of network data, the amount of information available online has exploded. How to quickly and accurately extract knowledge from massive data and apply it to various fields has become an active research topic. There are many research results on entity recognition and implicit relation extraction for English, but comparable research for Chinese is scarce. To this end, this thesis designs two models for Chinese text, one for named entity recognition and one for relation extraction.

Traditional feature-based methods are mature and leave limited room for improvement. To further improve the automation and performance of the models, this thesis focuses on named entity recognition and relation extraction models based on statistical machine learning and deep learning. The models build on advanced word vector technology and are guided by the theory of traditional machine learning and deep learning. The thesis also trains the models and analyzes and compares their results. The main work includes the following aspects:

1. Traditional stacked Markov named entity recognition requires the composition rules of domain-specific named entities to be summarized manually. This thesis incorporates word vectors so that the model learns the composition of named entities itself. This increases the automation of the algorithm and reduces its dependence on prior knowledge, making it more general and easier to use across domains.

2. In the relation extraction task, the model is built using several kinds of word vectors together with deep learning theory. A Transformer is used to address polysemy, and absolute and relative position embeddings are used so that the network captures word order information.

3. A text analysis and processing system is built that integrates various natural language processing models. The system follows software design principles such as low coupling and high cohesion, and combines 54 algorithms with storage and read-processing modules. It implements text reading, preprocessing, information extraction, storage of the extracted knowledge, and knowledge query and display.
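The abstract does not say which position embedding variants the relation extraction model uses. As an illustration only, the sketch below shows one common reading of each idea: a fixed sinusoidal table for absolute positions, and a clipped token-to-entity offset for relative positions, which is a frequent choice in relation extraction. The function names, the sinusoidal formulation, and the clipping parameter `max_distance` are assumptions for this example, not details taken from the thesis.

```python
import math


def absolute_position_encoding(seq_len, dim):
    """Sinusoidal absolute position table (an assumed variant, not the thesis's).

    Row `pos` depends only on the token's index in the sentence:
    even columns use sine, odd columns use cosine.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(dim):
            # Columns 2k and 2k+1 share the same frequency.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def relative_positions(seq_len, entity_index, max_distance):
    """Signed offset of every token from one entity token (hypothetical helper).

    Offsets are clipped to [-max_distance, max_distance] so they can be
    used as indices into a small learned embedding table.
    """
    return [
        max(-max_distance, min(max_distance, i - entity_index))
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Six tokens, with the entity of interest at index 2.
    print(relative_positions(6, entity_index=2, max_distance=3))
    print(absolute_position_encoding(seq_len=2, dim=4)[0])
```

In a setup like this, the absolute table tells the network where a token sits in the sentence, while the relative offsets tell it how far the token is from each candidate entity; the two signals would typically be added to or concatenated with the word vectors before the encoder.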
Keywords/Search Tags:Named Entity Recognition, Relation Extraction, Deep Learning, Statistical Machine Learning