
The Research On Hypernym Extraction From Definitions Based On Deep Learning

Posted on: 2022-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Tan
Full Text: PDF
GTID: 2518306530498244
Subject: Computer application technology
Abstract/Summary:
Text data is an important carrier of Internet content. Internet applications such as social media, Q&A communities, and news platforms generate massive amounts of text every day. Given this volume, efficiently locating and extracting useful information and knowledge has strong practical value, and the continued development of search engines, knowledge graphs, and natural language processing provides effective means for text mining. The hypernym-hyponym ("is-a") relation between concepts or entities in text is an important kind of knowledge. Extracting and identifying this relation helps build knowledge graphs, which in turn better support higher-level knowledge applications. At present, public lexical resources such as WordNet cover only general-domain concepts, a small number of professional fields, and the relations between those concepts, so they cannot meet the knowledge needs of specialized research. It is therefore necessary to study how to effectively extract hypernym relations from texts in various professional fields.

A definition sentence is a special kind of text used to describe or define a concept. Its syntax is relatively fixed, which makes it a suitable corpus for hypernym relation extraction algorithms. Research on extracting hypernyms from definition sentences has a long history. Early work used rule-based methods, which require manually written rules and are time-consuming, labor-intensive, and of limited effectiveness. With the rise of statistical machine learning, considerable progress was made on this problem. With the rapid development of deep learning in recent years, methods based on distributional and sequence information have further improved extraction performance. However, current methods mainly rely on specific syntactic structures and traditional word representation features, which limits their applicability.

This thesis proposes a syntactic structure feature representation based on part of speech and constructs a hypernym extraction model based on a bidirectional GRU (Gated Recurrent Unit). Compared with traditional word embeddings, the part-of-speech-based representation better reflects the syntactic position of the hypernym in a definition sentence, and the bidirectional GRU model performs better in both extraction quality and training time. The output of the sequence model can be further refined by incorporating degree centrality features and traditional word features, which improves the accuracy of hypernym extraction. The method is evaluated on two datasets, one with more regular syntax (Wikipedia) and one with less regular syntax (Stack Overflow). The experimental results show that it is more accurate and more general than other current methods, achieving F-scores of 91.2% and 92.4% on the two datasets and outperforming existing hypernym extraction methods. Beyond providing an effective method for hypernym extraction, this work also offers an example of using syntactic structure to learn semantic relations.
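
The two feature ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the toy part-of-speech lexicon, the example sentence, the (term, hypernym) pairs, and the function names are all hypothetical, and the graph used for degree centrality is assumed here to be built from candidate term-hypernym pairs. The bidirectional GRU itself is omitted.

```python
from collections import defaultdict

# Hypothetical toy POS lexicon; a real system would use a trained POS tagger.
TOY_POS = {
    "a": "DET", "an": "DET", "the": "DET",
    "is": "AUX", "of": "ADP",
    "sparrow": "NOUN", "bird": "NOUN", "language": "NOUN",
    "small": "ADJ",
}


def pos_sequence(tokens):
    """Replace each token by its POS tag; unknown tokens get the tag 'X'."""
    return [TOY_POS.get(token.lower(), "X") for token in tokens]


def degree_centrality(edges):
    """Degree centrality in an undirected graph: degree / (n - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# A definition sentence reduced to its POS sequence, the kind of input
# a sequence model such as a bidirectional GRU could consume.
tokens = "A sparrow is a small bird".split()
tags = pos_sequence(tokens)

# Hypothetical (term, hypernym) pairs, assumed to come from earlier predictions.
pairs = [("sparrow", "bird"), ("eagle", "bird"), ("python", "language")]
centrality = degree_centrality(pairs)
```

In this sketch a word that serves as the hypernym of many terms, such as "bird", receives a higher centrality score than its hyponyms, which is one plausible way such a feature could help rerank candidate hypernyms.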
Keywords/Search Tags:Hypernym Extraction, Syntactic Structure, Word Representation, Part of Speech, Gated Recurrent Units