
Research on Definition Extraction and Definition Expansion

Posted on: 2018-06-25; Degree: Master; Type: Thesis
Country: China; Candidate: J Y Wu; Full Text: PDF
GTID: 2348330536479946; Subject: Computer technology
Abstract/Summary:
In recent years, with the rapid development of big data and the Internet, information extraction (IE) has become a hotspot in natural language processing. Based on the semantic link network in Wikipedia, this thesis studies the definition extraction and definition expansion of concepts. The specific contents are as follows:

(1) Based on Wikipedia, the TRAA algorithm is proposed to study the semantic relations of definition sentences. First, web crawler technology is used to obtain Wikipedia page content, and definition sentences are manually filtered out as training data for the definition extraction experiments. Second, each definition sentence is converted by term recognition into a term set for storage. Finally, the structure of the Wikipedia semantic link network is analyzed, and the definition tightness of each term set is calculated from the network.

(2) A definition extraction model based on rules and statistics is proposed. First, the model takes scientific literature in the computer field as the extraction target and formulates matching patterns according to the characteristics of definition sentences in that literature. Second, the definition tightness and the number of terms are used as features to train a statistical model, which then filters the candidate sentences selected by the matching patterns a second time. Finally, experiments are designed and their results are analyzed using evaluation metrics.

(3) A window-based definition expansion model is proposed. The model uses text segmentation to analyze the relationship between a definition sentence and its context. The result of definition extraction is a single sentence, whereas the result of definition expansion may be a single sentence or multiple sentences. The model is implemented in three steps. First, it obtains the definition paragraph, that is, the paragraph in the scientific and technological literature that contains the definition sentence. Second, it calculates the semantic distance between sentences based on the Wikipedia semantic link network. Finally, it selects the context sentences whose semantic similarity to the definition sentence is higher than a threshold as the definition expansion.

(4) The Definition Dictionary System is designed and implemented. The main functions of the system are definition query and definition expansion query based on keywords entered by the user. A term definition query contains three steps. First, the system determines whether the keyword exists in Wikipedia's semantic link network. Second, it determines whether a definition of the keyword exists in the corpus. Third, it selects the sentence nearest to the definition sentence as the definition returned for the query. A definition expansion query also contains three steps. First, the system obtains the definition paragraph corresponding to the definition in the literature. Second, it calculates the similarity between the definition and its context. Finally, it filters the context by the threshold and outputs the remaining sentences as the definition expansion.
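The window-based expansion described in (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the similarity formula, so a plain Jaccard overlap of word sets stands in for the measure derived from the Wikipedia semantic link network, and the names `terms`, `similarity`, `expand_definition`, `window`, and `threshold` are all hypothetical.

```python
import re


def terms(sentence):
    """Lowercased word tokens; a crude stand-in for term recognition."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap of the two term sets.

    Assumption: this replaces the semantic-link-network measure,
    whose definition is not given in the abstract.
    """
    ta, tb = terms(a), terms(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def expand_definition(paragraph, def_index, window=2, threshold=0.15):
    """Return the definition sentence plus the sentences within `window`
    positions of it whose similarity to it reaches `threshold`,
    in their original order."""
    definition = paragraph[def_index]
    start = max(0, def_index - window)
    end = min(len(paragraph), def_index + window + 1)
    return [
        sentence
        for i, sentence in enumerate(paragraph[start:end], start=start)
        if i == def_index or similarity(definition, sentence) >= threshold
    ]
```

Given a definition paragraph split into sentences, calling `expand_definition(paragraph, def_index)` yields either the definition sentence alone or the definition together with related neighbors, which matches the single-or-multiple-sentence output described above.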
Keywords/Search Tags: Definition Extraction, Definition Expansion, Semantic Link Network, Similarity Calculation