Research On Key Technologies Of Question Answering System Serving "Agriculture, Farmers, Rural Area"

Posted on: 2013-10-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J L Zhang
GTID: 1368330473959269
Subject: Information Science
Abstract/Summary:
With growing demand for information, rapid growth of information resources on "Agriculture, Farmers, Rural Area" (AFR), and steady improvement of the AFR information infrastructure in rural areas, how to improve information services to meet that demand has become an urgent problem. AFR informatization is an important part of China's informatization. A Question Answering (QA) system can automatically and more accurately extract the answer to a question posed in natural language from a wide range of information resources. Building a QA system serving AFR would therefore promote the use of AFR information and benefit farmers, researchers and policy makers.

Against this background, this dissertation aims to build a QA system serving AFR. First, it describes the basic concepts and framework of QA systems, research at home and abroad, the research contents and methods, the significance of the work, and the structure of the dissertation. Second, it summarizes the theories of Chinese information processing that serve as the basis of the study. Third, it studies in turn the key technologies of a QA system serving AFR: AFR knowledge representation based on the AFR concept cluster, a FAQ system serving AFR based on a mixed strategy, AFR question classification, and answer extraction serving AFR. Finally, it describes the construction of a QA system serving AFR. The main research work is as follows:

Part one: research on the AFR concept cluster based on k-Nearest Neighbor (KNN). First, all entries and their explanatory content are extracted from the online "Agriculture Dictionary" using a DOM tree method, colloquial names are extracted using regular expressions, and the AFR table is designed.
Then, from the words automatically extracted from the explanatory content of the entries, feature words are produced through manual selection and merging. Feature vectors are built and reduced in dimensionality using the KL transform, and the AFR concept cluster is produced using KNN. Finally, experiments show that the methods are valid.

Part two: research on the retrieval method of the FAQ system serving AFR using mixed-strategy matching. The surface word similarity between questions is calculated from shared-word coverage, length and word order. The semantic similarity is calculated using HowNet and the AFR concept cluster. The similarity between the user's question and the question-answer pairs is calculated using Latent Semantic Analysis (LSA). A mixed strategy then combines these similarities to form the retrieval method of the FAQ system serving AFR. Finally, experiments verify the effectiveness of the method.

Part three: research on the AFR question classification system and method. The question classification system of the automatic QA system serving AFR is designed based on open-domain classification systems and AFR domain knowledge. Wh-words, the AFR concept cluster and HowNet sememes are used as classification features, feature values are calculated using information entropy, and a template-based coarse classification algorithm and an SVM-based fine classification algorithm are designed. Experiments show that the feature vector and classification method in this part can effectively meet the demand.

Part four: research on answer extraction for the automatic QA system serving AFR. Different answer extraction methods are proposed according to question category and answer source. A method based on the AFR knowledge base handles AFR factual questions. A method using templates of reason cue words handles AFR reason questions.
For AFR "how" questions, a method based on automatic summarization is proposed. These algorithms are also validated by experiment.

Part five: the construction and implementation of the QA system serving AFR. This part describes the system environment, the programming techniques and tools used for implementation, the system design, and the results of the implementation.

The last part summarizes the contributions of the research, points out its shortcomings, and discusses future work.
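
The mixed-strategy matching of part two can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract gives no formulas, so the definitions of coverage, length and word-order similarity, all function names, and all weights below are assumptions chosen for illustration. Questions are assumed to be already segmented into word lists, and the semantic (HowNet and AFR concept cluster) and LSA scores are assumed to be computed elsewhere and passed in as numbers in [0, 1].

```python
from typing import Sequence, Tuple


def coverage_sim(q1: Sequence[str], q2: Sequence[str]) -> float:
    """Shared-word coverage (assumed form: Dice coefficient over word sets)."""
    s1, s2 = set(q1), set(q2)
    if not s1 or not s2:
        return 0.0
    return 2 * len(s1 & s2) / (len(s1) + len(s2))


def length_sim(q1: Sequence[str], q2: Sequence[str]) -> float:
    """Length similarity (assumed form: 1 - |len1 - len2| / (len1 + len2))."""
    total = len(q1) + len(q2)
    if total == 0:
        return 0.0
    return 1.0 - abs(len(q1) - len(q2)) / total


def order_sim(q1: Sequence[str], q2: Sequence[str]) -> float:
    """Word-order similarity (assumed form): take the shared words in q1
    order, look up their first positions in q2, and penalize each adjacent
    pair that appears in reversed order."""
    words_in_q2 = set(q2)
    common = [w for w in dict.fromkeys(q1) if w in words_in_q2]
    if not common:
        return 0.0
    if len(common) == 1:
        return 1.0
    positions = [list(q2).index(w) for w in common]
    inversions = sum(1 for a, b in zip(positions, positions[1:]) if a > b)
    return 1.0 - inversions / (len(common) - 1)


def surface_sim(
    q1: Sequence[str],
    q2: Sequence[str],
    weights: Tuple[float, float, float] = (0.6, 0.2, 0.2),  # hypothetical
) -> float:
    """Weighted sum of coverage, length and word-order similarity."""
    w_cov, w_len, w_ord = weights
    return (
        w_cov * coverage_sim(q1, q2)
        + w_len * length_sim(q1, q2)
        + w_ord * order_sim(q1, q2)
    )


def mixed_score(
    surface: float,
    semantic: float,
    lsa: float,
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # hypothetical
) -> float:
    """Combine the three similarities; a weighted sum is one possible
    'mixed strategy', assumed here for illustration."""
    w_sur, w_sem, w_lsa = weights
    return w_sur * surface + w_sem * semantic + w_lsa * lsa
```

In such a setup, a user question would be scored against every stored FAQ question with `mixed_score`, and the answer of the highest-scoring pair returned if the score passes a threshold. The weights and any threshold would need to be tuned on real AFR question data.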
Keywords/Search Tags: Question Answering (QA) serving "Agriculture, Farmers, Rural Area" (AFR), AFR concept cluster, Frequently Asked Questions (FAQ) of AFR, AFR question classification, answer extraction