
Research On Natural Language Semantic Feature Representation For Knowledge Base Question And Answer

Posted on: 2018-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Wang
GTID: 1318330518970157
Subject: Network and network resource management
Abstract/Summary:
Question answering over knowledge base (KBQA) obtains answers to user questions by matching and reasoning over the information in a knowledge base, and is an important part of question answering (Q&A). It aims to automatically understand the questions posed by users and to extract answers from information on the network. The key to KBQA is to understand natural language in depth: deep learning is used to map the questions and the knowledge base into a shared low-dimensional semantic space, which turns Q&A into a similarity measurement between their semantic vectors. KBQA should therefore focus on the semantic feature representation of natural language. The current difficulties in KBQA mostly center on this representation, and mainly include the following aspects.
(1) There is no unified representation method for natural language questions, and the semantic relations among different questions have not been studied thoroughly.
(2) Natural language is often ambiguous: the same text may carry different meanings in different contexts, which makes it difficult to understand the semantics of a question correctly.
(3) Deep learning algorithms can transform a structured knowledge base into semantic features that correspond to the question representation, but they need to be improved to keep up with the rapid growth of knowledge bases.
(4) The semantic features of the knowledge base are derived from multiple correlated knowledge bases with different structures.
Therefore, it is necessary to generate answers automatically and effectively from knowledge base semantic features that come from different sources.
To address these problems and deficiencies in KBQA, the thesis concentrates on four topics: improving the question representation, selecting the semantic features of the question, the semantic representation of the knowledge base, and the semantic clustering of the knowledge base.
(1) For the semantic representation of the question, a quantum distributional representation method was proposed based on quantum theory.
The character-level quantum distributional representation used the quantum state, the quantum superposition state, the unitary operator and quantum mixed state theory to represent basic characters, words, phrases and dynamic texts, respectively. The quantum embedding was trained in the same way as a word embedding. This method reflects richer morpheme features and represents the semantic relations among texts more thoroughly. In addition, by means of the density operator, it represented words, sentences and long document-level texts as density matrices of the same dimension, without preprocessing the input text to a uniform length. The experiments showed that the proposed quantum distributional representation method was more effective than the compared models in tasks such as word similarity, synonym detection, text classification, and sentiment analysis.
(2) For the semantic understanding of the question, a semantic feature selection method was proposed based on the convolutional neural network model.
It introduced multilayer perceptron convolution to enhance the model's ability to abstract nonlinearly separable concepts, and adopted the Dropout strategy to further improve the efficiency of the model. Finally, it achieved the semantic feature selection of the quantum embedding with the improved model.
This method adopted the quantum distributional representation as the input of the model; that is, it required neither morphological annotation of the text as preprocessing nor pre-trained word embeddings in the input layer. Moreover, introducing the multilayer perceptron convolution greatly reduced the number of model parameters. The experimental results showed that the convolutional neural network feature selection method based on the quantum semantic space captured richer semantic features as well as the spelling features of words.
(3) For the semantic representation of the knowledge base, a method was proposed based on the joint embedding of the knowledge graph and a corpus.
It jointly embedded the quantum distributional representation and the knowledge graph, and thereby improved the efficiency of the automatic expansion of the knowledge base. The method improves the utilization efficiency of the semantic relations among the quantum distributional representations. Moreover, the quantum distributional representation is much smaller in scale than a word embedding representation with the same vocabulary, so that the text model can be calculated directly. The experimental results showed that the method performed better than the compared methods in triplet classification, relation extraction and link prediction.
(4) For answer generation, a semantic clustering algorithm was proposed based on biogeography-based optimization.
It introduced the affinity propagation strategy into biogeography-based optimization to enhance the ability to mine relations between data, adopted the Memetic framework to strengthen the global search ability, and finally realized the semantic clustering by means of the peak density clustering algorithm.
The method can mine the deep relations among the semantic representations, and it improves the quality of semantic clustering by strengthening the global search ability of biogeography-based optimization. The experimental results showed that the method was more accurate and effective than the compared algorithms.
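The density-matrix idea in part (1), where texts of any length become matrices of the same dimension, can be illustrated with a minimal sketch. This is not the thesis's character-level model: it assumes real-valued toy vectors, a uniform mixture over tokens, and the trace inner product tr(ρσ) as a similarity score. The names `density_matrix`, `trace_similarity` and the `TOY_EMBEDDINGS` values are hypothetical and chosen only for illustration.

```python
import math

def normalize(vec):
    """Scale a real vector to unit length so it can act as a pure state."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vec]

def density_matrix(token_vectors, weights=None):
    """Mixed state rho = sum_i p_i |v_i><v_i| over the token vectors.

    The result is always dim x dim, whatever the number of tokens.
    Uniform weights are an assumption of this sketch.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    n = len(token_vectors)
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    dim = len(token_vectors[0])
    rho = [[0.0] * dim for _ in range(dim)]
    for vec, w in zip(token_vectors, weights):
        unit = normalize(vec)
        p = w / total  # mixture probabilities sum to 1, so tr(rho) = 1
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * unit[i] * unit[j]
    return rho

def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[i][i] for i in range(len(matrix)))

def trace_similarity(rho, sigma):
    """tr(rho sigma): a similarity score between two density matrices."""
    dim = len(rho)
    return sum(rho[i][j] * sigma[j][i] for i in range(dim) for j in range(dim))

# Hypothetical 3-dimensional toy embeddings, not trained values.
TOY_EMBEDDINGS = {
    "who": [1.0, 0.0, 0.0],
    "wrote": [0.0, 1.0, 0.0],
    "hamlet": [0.0, 0.0, 1.0],
}

word_rho = density_matrix([TOY_EMBEDDINGS["hamlet"]])
question_rho = density_matrix(
    [TOY_EMBEDDINGS[w] for w in ("who", "wrote", "hamlet")]
)
score = trace_similarity(word_rho, question_rho)
```

Here a single word and a three-word question both map to 3 x 3 matrices with unit trace, so they can be compared directly without padding or truncating the input.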
Keywords/Search Tags: Question answering over knowledge base (KBQA), quantum semantic, Convolutional Neural Network (CNN), knowledge graph, semantic clustering