
Research On Named Entity Relation Extraction Based On Web Text Mining

Posted on: 2018-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: H L Lai
GTID: 2428330566954145
Subject: Management Science and Engineering
Abstract/Summary:
Named entity relation extraction is an important research topic in the field of information extraction. From an application standpoint, it is one of the key technologies behind intelligent search, automatic question answering, and knowledge mapping systems. From the standpoint of basic theory, it matters for Natural Language Processing tasks such as machine translation, text classification, automatic abstracting, and word discovery. Current domestic and foreign research on named entity relation extraction concentrates mostly on the seven main relation types defined by ACE (Automatic Content Extraction), and research on named entities in the field of agricultural information is relatively rare. In addition, most research relies on either knowledge engineering or machine learning, and most machine learning methods are either supervised or unsupervised, while approaches that combine manual effort with machine learning are rare. In view of this research situation at home and abroad, this paper focuses on extracting named entity relations in the banana domain, applying both manual and machine learning methods. Specifically, the research work includes the following aspects:

(1) Constructing a banana-oriented named entity corpus. Based on an analysis of the characteristics of agricultural information, a focused (directional) crawler was designed and web pages about bananas were collected. A Web information extraction model based on web page features was then designed and used to extract information from the pages. Finally, named entity pairs were extracted, completing the construction of the agricultural information named entity corpus.

(2) Research on named entity relation extraction for banana. Based on the established named entity corpus, this paper puts forward a named entity relation extraction model built on Word2Vec and a seed self-expansion method. The core of the model is that named entity pairs are transformed into numerical vectors, and the similarity between named entity pairs is represented by the similarity between those vectors. The model is used to extract named entity relations from the agricultural information named entity corpus, and experiments are designed and their results analyzed.

(3) Design and implementation of an agricultural information named entity relation extraction system. The overall architecture of the system is designed according to its functions, and the detailed design of the system is completed.

Using the named entity relation extraction method designed in this paper, several experiments were run on the banana named entity corpus. They achieved an average accuracy of 78.4% and an average recall of 60.2%, which are good results and verify the validity of the method.
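The abstract does not state how an entity pair is turned into a vector or how the seed set grows, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a pair vector is the difference between the two entity embeddings, that similarity is cosine similarity, and that a candidate pair joins the seed set when its best similarity to any current seed reaches a threshold. The embeddings are hypothetical toy values standing in for trained Word2Vec vectors, and the names `pair_vector`, `expand_seeds`, and `TOY_EMBEDDINGS` are illustrative, not taken from the thesis.

```python
import math

# Hypothetical 3-dimensional vectors standing in for trained Word2Vec embeddings.
TOY_EMBEDDINGS = {
    "banana":   [1.0, 0.0, 0.0],
    "plantain": [0.9, 0.1, 0.0],
    "wilt":     [0.0, 1.0, 0.0],
    "weevil":   [0.1, 0.9, 0.1],
    "sigatoka": [0.0, 0.8, 0.3],
    "leafspot": [0.0, 0.6, 0.5],
    "hainan":   [0.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_vector(pair, embeddings):
    """Assumption: represent a (head, tail) pair as tail vector minus head vector."""
    head, tail = pair
    return [t - h for h, t in zip(embeddings[head], embeddings[tail])]


def expand_seeds(seeds, candidates, embeddings, threshold=0.95):
    """Grow the seed set until no remaining candidate is similar enough to any seed."""
    accepted = list(seeds)
    remaining = [c for c in candidates if c not in accepted]
    changed = True
    while changed:
        changed = False
        for cand in list(remaining):
            cand_vec = pair_vector(cand, embeddings)
            best = max(cosine(cand_vec, pair_vector(s, embeddings)) for s in accepted)
            if best >= threshold:
                # Newly accepted pairs act as seeds for later comparisons.
                accepted.append(cand)
                remaining.remove(cand)
                changed = True
    return accepted


if __name__ == "__main__":
    seeds = [("banana", "wilt")]  # one hand-labelled pair for a single relation
    candidates = [
        ("plantain", "weevil"),
        ("banana", "leafspot"),
        ("banana", "sigatoka"),
        ("banana", "hainan"),
    ]
    for pair in expand_seeds(seeds, candidates, TOY_EMBEDDINGS):
        print(pair)
```

In this toy data, `("banana", "leafspot")` is not close enough to the original seed but is accepted once `("banana", "sigatoka")` has joined the seed set, which is the self-expanding behaviour. The same mechanism can let the seed set drift away from the original relation, so the threshold is a trade-off between precision and recall.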
Keywords/Search Tags:Named entity relation extraction, Web text mining, Named entity corpus, Word2Vec, Information extraction