Research On Question Answering System Based On Learning To Rank

Posted on: 2019-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: W N Li
GTID: 2428330566995992
Subject: Computer software and theory
Abstract/Summary:
The accessibility and availability of massive Internet data have greatly promoted research on information acquisition technology. However, the continuous expansion of network resources has made it hard for people to find the information they need quickly. For this reason, question answering systems have been widely adopted in medicine, education, business, tourism and other fields. Presenting several candidate answers to a user's question in ranked order not only improves the user experience but also supports the further development of question answering systems. Traditional answer ranking either uses a single feature as the ranking criterion or assigns feature weights unreasonably, which leads to poor ranking performance. To address this problem, this thesis studies the extraction of primary and advanced features of questions and answers, with the aim of capturing the intrinsic relationship between a question and its answers at the feature level. Normalization and a ranking-based learning algorithm are then applied to perform feature selection implicitly, eliminate differences in magnitude between features, and reduce redundant features and noise. With the primary and advanced text features as input, the feature weights are learned with a listwise learning-to-rank method and used to optimize the ranking results.

At present, similarity between short texts such as questions is mostly computed from word frequencies, and a simple cosine measure limits accuracy and suffers from the curse of dimensionality. To address these problems, this thesis adopts a short-text representation based on word2vec distributed word vectors and divides the similarity calculations in the question answering system into two categories: similarity between different questions, that is, short-text similarity, and similarity between a given question and each of its candidate answers. In the word vector space, an Earth Mover's Distance (EMD) feature is introduced, which measures the minimum cumulative distance that the words of one text must travel to reach the corresponding words of another text, and a mixed multi-feature strategy is used to improve the accuracy of the similarity calculation.

To strengthen the fuzzy matching ability of the question answering system, the recall of similar questions is raised by expanding the keywords of a question. To balance accuracy against efficiency, the expansion is applied mainly to the nouns and verbs among the keywords. Finally, a system is implemented that displays the degree of relevance of each candidate answer to the user's question, which shows that the improvements proposed in this thesis are feasible in practice.
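The normalization and listwise weight-learning steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes min-max normalization, a linear scoring function, and a ListNet-style top-one cross-entropy loss trained by plain gradient descent. The three feature columns (cosine similarity, keyword overlap, answer length), the relevance grades, and all function names are hypothetical.

```python
import math

def min_max_normalize(rows):
    """Scale every feature column to [0, 1] so no feature dominates by magnitude."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def softmax(values):
    """Turn scores into a probability distribution (numerically stable form)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def score(weights, row):
    """Linear scoring function: weighted sum of one answer's features."""
    return sum(w * x for w, x in zip(weights, row))

def listnet_loss(weights, features, relevance):
    """Cross entropy between the top-one distributions of labels and scores."""
    target = softmax(relevance)
    predicted = softmax([score(weights, row) for row in features])
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

def train_listnet(features, relevance, lr=0.5, epochs=1000):
    """Learn feature weights for one question's answer list by gradient descent."""
    weights = [0.0] * len(features[0])
    target = softmax(relevance)
    for _ in range(epochs):
        predicted = softmax([score(weights, row) for row in features])
        # Gradient of the loss w.r.t. weight j: sum_i (p_i - t_i) * x_ij
        gradient = [
            sum((p - t) * row[j] for p, t, row in zip(predicted, target, features))
            for j in range(len(weights))
        ]
        weights = [w - lr * g for w, g in zip(weights, gradient)]
    return weights

def rank_answers(weights, features):
    """Return answer indices ordered from highest to lowest score."""
    scores = [score(weights, row) for row in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)

# Hypothetical data: four candidate answers to one question.
# Columns (assumed): cosine similarity, keyword overlap count, answer length.
raw_features = [[0.9, 5, 120], [0.7, 3, 300], [0.4, 2, 80], [0.1, 0, 500]]
relevance = [3, 2, 1, 0]  # assumed graded relevance labels, higher is better

features = min_max_normalize(raw_features)
weights = train_listnet(features, relevance)
print("learned weights:", [round(w, 3) for w in weights])
print("ranking (best first):", rank_answers(weights, features))
```

A listwise loss compares the whole answer list at once instead of scoring answers one by one or in pairs, which is why the weights are fitted against a distribution over all candidates. A realistic setup would sum this loss over many questions rather than fit a single list.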
Keywords/Search Tags: Question Answering System, Learning to Rank, Feature Processing, Word Vector, Short Text Similarity Calculation