
Study On Lexical And Phrasal Paraphrase Extraction Based On Context Analysis

Posted on: 2018-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: L J Wang
Full Text: PDF
GTID: 2348330533469807
Subject: Computer technology
Abstract/Summary:
In everyday language, people often express the same information with different wording, a phenomenon known as paraphrase. Paraphrase makes many natural language processing (NLP) tasks more complex and difficult. Lexical and phrasal paraphrase extraction aim to extract words and phrases that share the same meaning. The resulting paraphrase resources have important applications in NLP tasks such as question answering, information retrieval, machine translation, and text generation, where they can improve system performance. This thesis studies lexical and phrasal paraphrase extraction based on context analysis and covers three topics: lexical paraphrase extraction based on context analysis, phrasal paraphrase extraction based on the pivot method, and phrasal paraphrase extraction based on context analysis.

First, this thesis proposes a method for extracting lexical paraphrases based on context analysis. Current research on lexical paraphrase extraction mainly uses the pivot method to extract lexical paraphrases from bilingual parallel corpora. Here, the pivot method is instead applied to online translation resources for Chinese words to extract candidate lexical paraphrases, which avoids the alignment errors of bilingual parallel corpora and avoids extracting erroneous pairs that do not share a translation. Word vectors are learned from the contexts of words. A score computed by a feedforward neural network is then combined with a similarity score to give the final score of each candidate, and this final score is used to sort and filter the lexical paraphrase resources. Filtering with context information and other information reduces the errors caused by polysemy of the foreign-language translations. Manual evaluation of the extracted lexical paraphrase resource shows that its quality is superior to that of the traditional pivot-based method.

Second, building on the commonly used pivot-based approach to phrasal paraphrase extraction, this thesis uses translation probability and context information to filter the phrasal paraphrase resources. This addresses the incorrect phrasal paraphrases that the pivot method extracts because of bilingual alignment errors and polysemy in foreign-language translations. Experimental results show that filtering the candidate phrasal paraphrases with context information greatly improves the quality of the extracted phrasal paraphrases.

Finally, this thesis proposes a method for extracting phrasal paraphrases based on context analysis. A two-layer BiLSTM-CRF model segments phrases in a Chinese monolingual corpus, and a deep learning model then learns vector representations of the phrases. Phrases with high cosine similarity are extracted as candidate phrasal paraphrases. The English translations of the words are used to filter these candidates, and a method for learning phrase context vectors is proposed so that the candidates can be sorted by context vector similarity. Experimental results show that the neural network model can learn semantic vector representations of phrases, and that the quality of the phrasal paraphrase resource after filtering and sorting is much higher than that of the resource extracted by the pivot-based method.
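The two ideas that recur in the abstract, pivot-based candidate extraction and filtering by vector similarity, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy translation dictionary, the two-dimensional vectors, and the 0.5 threshold are all assumptions made for illustration, and the feedforward network score, translation probabilities, and BiLSTM-CRF segmentation described above are omitted.

```python
from collections import defaultdict
from itertools import combinations
import math


def pivot_candidates(translations):
    """Pair up words that share at least one translation (the pivot)."""
    words_by_pivot = defaultdict(set)
    for word, pivots in translations.items():
        for pivot in pivots:
            words_by_pivot[pivot].add(word)
    pairs = set()
    for words in words_by_pivot.values():
        # Sort so each unordered pair is stored once, in a fixed order.
        for a, b in combinations(sorted(words), 2):
            pairs.add((a, b))
    return pairs


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(pairs, vectors, threshold=0.5):
    """Keep pairs whose vectors are similar enough, most similar first."""
    scored = [
        (cosine(vectors[a], vectors[b]), a, b)
        for a, b in pairs
        if a in vectors and b in vectors
    ]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Hypothetical toy data: Chinese words with English translations as pivots.
translations = {
    "汽车": {"car", "automobile"},
    "轿车": {"car", "sedan"},
    "银行": {"bank"},
    "河岸": {"bank", "riverbank"},
}
# Hypothetical context vectors; real ones would be learned from a corpus.
vectors = {
    "汽车": [0.9, 0.1],
    "轿车": [0.8, 0.2],
    "银行": [0.1, 0.9],
    "河岸": [0.9, -0.3],
}

pairs = pivot_candidates(translations)
ranked = rank_candidates(pairs, vectors)
for score, a, b in ranked:
    print(f"{a} ~ {b}: {score:.3f}")
```

In this toy data the polysemous pivot "bank" links 银行 (financial bank) and 河岸 (riverbank) as a candidate pair. Their context vectors point in different directions, so the similarity filter discards the pair, which is the kind of polysemy error the context-based filtering is meant to reduce.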
Keywords/Search Tags:paraphrase extraction, context information, phrase segmentation, deep learning