Research On Word Embedding Based Chinese Lexical Entailment Knowledge Acquisition

Posted on: 2017-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: H X Zhou
Full Text: PDF
GTID: 2348330488970963
Subject: Computer technology
Abstract/Summary:
Textual entailment recognition is an important research topic in natural language processing, with applications in information retrieval, information extraction, question answering, and machine translation. Many studies have shown that the broader the coverage of lexical entailment rules, the more they help textual entailment recognition. Acquiring large amounts of lexical entailment knowledge is therefore of great significance for textual entailment recognition and related applications, and collecting a large number of lexical entailment rules from large-scale text corpora is key to improving recognition performance. Extracting lexical entailment rules often requires determining whether an entailment relation holds between two given words, a task known as lexical entailment recognition.

Research on Chinese lexical entailment knowledge acquisition remains insufficient, whereas English lexical entailment has been studied from many perspectives and many recognition models have been proposed for it. This paper first learns word vector representations from a Chinese Wikipedia corpus using a currently popular word embedding method. On this basis, we present two different ways to identify Chinese lexical entailment:

(1) Recognition of lexical entailment relations based on word embedding combination features. We first construct a variety of effective word vector combination features, including sum features, difference features, product features, concatenation features, and combined features of the word vectors. These features reflect different characteristics of lexical entailment relations. A Support Vector Machine (SVM) model is then trained to classify candidate noun pairs for lexical entailment.

(2) Recognition of lexical entailment relations based on the Semantic Condensation Degree of word embeddings. We propose a measure called “Semantic Condensation Degree”, which reflects well the entailment relation between two given words. After computing this measure, we construct a number of features based on it and train an SVM model to classify candidate noun pairs for lexical entailment.

The experimental results show that the lexical entailment classification methods presented in this paper perform well on Chinese noun lexical entailment classification. The training and test data for Chinese noun lexical entailment, the Python code implementing these methods, and the experimental results can be downloaded from the given website, providing a corpus and experimental foundation for further research on Chinese lexical entailment.
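The combination features named in method (1) can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function names `combination_features` and `combined_feature_vector` are hypothetical, the difference is taken as first word minus second word, the product is element-wise, and plain Python lists stand in for the learned word vectors.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def combination_features(u: Vector, v: Vector) -> Dict[str, List[float]]:
    """Build simple pairwise features from two word vectors.

    u and v are the embeddings of the two candidate words.
    The feature definitions below are assumptions, not the thesis's own.
    """
    if len(u) != len(v):
        raise ValueError("word vectors must have the same dimension")
    return {
        # element-wise sum of the two vectors
        "sum": [a + b for a, b in zip(u, v)],
        # element-wise difference (assumed direction: u - v)
        "difference": [a - b for a, b in zip(u, v)],
        # element-wise (Hadamard) product
        "product": [a * b for a, b in zip(u, v)],
        # u followed by v, giving a vector of twice the dimension
        "concatenation": list(u) + list(v),
    }


def combined_feature_vector(
    u: Vector, v: Vector, kinds: Sequence[str] = ("difference", "product")
) -> List[float]:
    """Join several feature types into one flat vector for a classifier."""
    feats = combination_features(u, v)
    out: List[float] = []
    for kind in kinds:
        out.extend(feats[kind])
    return out


# Toy two-dimensional vectors; real embeddings would have far more dimensions.
toy_u = [1.0, 2.0]
toy_v = [0.5, 4.0]
toy_row = combined_feature_vector(toy_u, toy_v)
```

Each word pair yields one such row, and the rows together with entailment labels could then be passed to an SVM classifier, for example `sklearn.svm.SVC` from scikit-learn. Which feature types to join, and in what order, is a design choice that the abstract leaves open.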
Keywords/Search Tags: Textual entailment, Lexical entailment, Word embedding, Entailment feature, Semantic Condensation Degree