Research on Multi-Source Knowledge-Based Geographical Choice Question Answering Method

Posted on: 2016-11-14
Degree: Master
Type: Thesis
Country: China
Candidate: J X Lin
GTID: 2308330503451119
Subject: Computer Science and Technology
Abstract/Summary:
Since the beginning of the 21st century, question answering systems have been widely studied in academia because of their broad application prospects. It is of great strategic significance to investigate how to construct large-scale knowledge resources for elementary education and the corresponding knowledge map, and to develop a human-like question answering (Q&A) system that can answer questions by analyzing and extracting information from these knowledge resources.

In this paper, the characteristics and difficulties of solving geography examination questions are investigated by analyzing college entrance examination questions from across China. According to differences in the solving process, questions are divided into 7 categories; conceptual questions are the main focus of this paper. Text data are collected from geography textbooks and from knowledge sources such as Wikipedia and Baidu Baike, and are processed automatically by a filtering algorithm. Additionally, a label system for geography examination questions is built for classifying and labeling question resources.

When answering geography examination questions, the main emphasis and difficulty of the research lie in correctly understanding the geographical entities in the questions. In this paper, a program for extracting and deduplicating geographic entities is written, based on an analysis of multi-source geographic knowledge documents. Based on the distributional features of geographical entities in encyclopedia documents, a method for calculating the transfer distances and distances between entities is proposed. In addition, the Floyd algorithm is modified to improve the efficiency of building and refreshing the entity relations and distances.
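The abstract does not say how the Floyd algorithm was modified, so the following is only a minimal sketch of the general idea: compute all-pairs distances once with the standard Floyd-Warshall algorithm, then refresh them in O(n^2) when a new entity relation is added instead of recomputing everything. The function names `floyd_warshall` and `add_edge`, the undirected graph, and the non-negative edge weights are all assumptions made for illustration, not the author's implementation.

```python
import math


def floyd_warshall(nodes, edges):
    """All-pairs shortest distances for an undirected graph.

    `edges` maps (u, v) pairs to non-negative weights (assumed format).
    """
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), w in edges.items():
        if w < dist[u][v]:
            dist[u][v] = w
            dist[v][u] = w
    # Standard triple loop: allow each node k in turn as an intermediate stop.
    for k in nodes:
        for i in nodes:
            d_ik = dist[i][k]
            if d_ik == math.inf:
                continue
            for j in nodes:
                cand = d_ik + dist[k][j]
                if cand < dist[i][j]:
                    dist[i][j] = cand
    return dist


def add_edge(dist, u, v, w):
    """Refresh distances in O(n^2) after adding an undirected edge (u, v, w).

    Hypothetical incremental update: with non-negative weights, a shortest
    path uses the new edge at most once, in one of its two directions.
    """
    nodes = list(dist)
    # Snapshot the old distances that the update reads from.
    to_u = {i: dist[i][u] for i in nodes}
    to_v = {i: dist[i][v] for i in nodes}
    from_u = dict(dist[u])
    from_v = dict(dist[v])
    for i in nodes:
        for j in nodes:
            cand = min(to_u[i] + w + from_v[j], to_v[i] + w + from_u[j])
            if cand < dist[i][j]:
                dist[i][j] = cand
    return dist
```

A full recomputation costs O(n^3), so an update of this kind would matter when entities and relations are added to the network one at a time.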
Finally, an entity relation network that includes the distances between entities is built on the basis of these calculations.

The process of solving choice questions can be converted into computing and ranking the confidence levels of the candidate choices. Therefore, in this paper, a method based on document correlativity and sentence similarity is presented to evaluate the confidence level of a candidate choice. Then, using the entity relation network, a method for merging the information in the questions is proposed and applied to the confidence level computation. In addition, the use of SVM and logistic regression in the choice ranking process is explored, using text features obtained from the confidence level computation.

To better demonstrate the geographical Q&A system, an online examination system is built for interested users. Furthermore, a series of comparative experiments is designed to compare the performance of different methods, namely confidence level ranking, SVM, and logistic regression, with and without the information from the entity relation network. According to the experimental results, the accuracy of the Q&A system improves from 0.311 to 0.402 when the confidence level ranking method is used with the entity relation network, which demonstrates the effectiveness of the method proposed in this paper.
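The abstract gives no formulas for document correlativity or sentence similarity, so the following is only a rough sketch of how confidence level ranking of candidate choices might look. It assumes bag-of-words cosine similarity for both scores and a hypothetical mixing weight `doc_weight`; the names `confidence` and `rank_choices` are illustrative, and the entity relation network, SVM, and logistic regression steps are not modeled here.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens (a simplification; real text would need a proper tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(c * c for c in ca.values()))
    nb = math.sqrt(sum(c * c for c in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def confidence(question, choice, documents, doc_weight=0.5):
    """Score one candidate choice against a document collection.

    Assumed scoring: a weighted mix of document-level similarity and the
    best sentence-level similarity, maximized over documents.
    """
    query = tokens(question + " " + choice)
    best = 0.0
    for doc in documents:
        doc_score = cosine(query, tokens(doc))
        sentences = [s for s in re.split(r"[.!?]\s*", doc) if s.strip()]
        sent_score = max((cosine(query, tokens(s)) for s in sentences), default=0.0)
        best = max(best, doc_weight * doc_score + (1 - doc_weight) * sent_score)
    return best


def rank_choices(question, choices, documents):
    """Return (score, choice) pairs sorted from most to least confident."""
    scored = [(confidence(question, c, documents), c) for c in choices]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

With a toy document such as "The Yangtze is the longest river in China. It flows into the East China Sea." and the question "Which is the longest river in China?", the choice "Yangtze" would rank above "Yellow River", "Pearl River", and "Amur" under this scoring.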
Keywords/Search Tags: Q&A system, answer ranking, entity relation network, knowledge base construction, geography choice question