Research On Question Similarity Computation Of Domain Question Answering System

Posted on: 2015-03-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q S Wan
GTID: 1318330518471557
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, people increasingly rely on information retrieval systems to obtain the knowledge they need. A question answering (Q&A) system is a new generation of intelligent information retrieval system: it automatically analyzes the questions users raise and returns an appropriate answer, and so can better meet information retrieval needs. The key problem for a Q&A system is understanding user questions. One solution is question similarity calculation, whose purpose is to analyze and understand user questions; the accuracy of its results directly determines the correctness of the answer. This thesis studies question similarity calculation methods for a Q&A system applied in the social security audit domain. To meet the challenges of question similarity calculation in such a system, the following work has been carried out.

1) A knowledge-assisted question classification method is proposed. The method first takes audit methods extracted from the audit methodology as expert knowledge to construct a training set, and then trains a Support Vector Machine on that set to obtain a decision hyperplane. The hyperplane is used to select adjust samples located within the classification margin. Finally, a relearning process is performed with the selected samples until no unselected samples remain in the adjust set or the classification performance meets the requirements. Guided by the audit methodology, the algorithm uses an active learning strategy to select the samples most important for classification, which yields better classification performance at higher efficiency.

2) A question similarity calculation method based on formal concept analysis (FCA) is proposed. The method uses the semantic and syntactic structure of questions to calculate their similarity. It applies FCA to extract and build a set of domain concepts, and improves the accuracy of question similarity calculation by constructing a concept lattice and calculating the similarity of concept sets. The FCA-based method converts question similarity calculation into similarity calculation over concept vectors, and then analyzes and calculates the similarity using FCA concepts, which makes the calculation more accurate and more stable.

3) A domain question similarity calculation method supported by an ontology is proposed. The method first obtains, from the domain ontology, the similarity relations introduced by experts, and then uses these relations to construct a similarity graph of the domain ontology and the formal context. The similarity graph supplies expert knowledge to the similarity calculation of FCA concepts and thereby improves its accuracy. Furthermore, d-transitive similarity relations are introduced into the similarity graph to bound transitive similarity relations, which improves the efficiency of FCA concept similarity calculation without loss of accuracy. On this basis, a method for constructing a bipartite graph from the d-transitive similarity graph is proposed, so that the problem of finding the maximum-weight candidate set when calculating FCA-based domain question similarity is converted into the maximum-weight matching problem on a bipartite graph. The maximum-weight candidate set of dual pairs can then be obtained by solving for a complete matching of the equality subgraph of the bipartite graph constructed from the d-transitive similarity graph, which further improves the efficiency of domain question similarity calculation.

4) A domain question similarity calculation method based on information content is proposed. The method introduces a semantics-based information content measure into the FCA concept similarity matching method, which is implemented with a probability-based information content calculation method. An information-content-based domain question similarity calculation method using this measure is then proposed. It obtains the similarity between FCA concepts automatically, without relying on human experts, and it is independent of any corpus, whereas the probability-based information content method is limited by its reliance on a corpus. The method uses upper and lower (hypernym and hyponym) semantic relations to calculate the similarity of FCA concepts, so that it reflects more accurately how general or specific a concept is. It takes the proportion of general and specific levels as the measure of information content, and the resulting question similarity accuracy is significantly better than that of the method using probabilistic information content.
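
The matching step described in contribution 3 can be illustrated with a small Python sketch. It is not the thesis's algorithm: the abstract describes solving for a complete matching of the equality subgraph (the approach taken by the Kuhn-Munkres algorithm), whereas this sketch uses exhaustive search, which finds the same maximum-weight matching on small inputs but does not scale. The function names, the example weights, and the normalization by the larger concept-set size are all assumptions made for illustration.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Return (total, pairs) for a maximum-weight bipartite matching.

    weights[i][j] is a hypothetical similarity between concept i of
    question A and concept j of question B. Exhaustive search is used
    here for clarity; it is only practical for small concept sets.
    """
    rows = len(weights)
    cols = len(weights[0]) if rows else 0
    # Assign from the smaller side so that every permutation is valid.
    if rows > cols:
        transposed = [[weights[i][j] for i in range(rows)] for j in range(cols)]
        total, pairs = max_weight_matching(transposed)
        return total, [(i, j) for j, i in pairs]
    best_total, best_pairs = 0.0, []
    for chosen in permutations(range(cols), rows):
        total = sum(weights[i][j] for i, j in enumerate(chosen))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(chosen))
    return best_total, best_pairs


def question_similarity(weights):
    """Normalize the matching weight to [0, 1] for weights in [0, 1].

    Dividing by the larger concept-set size is an assumption of this
    sketch, not a formula taken from the thesis.
    """
    if not weights or not weights[0]:
        return 0.0
    total, _ = max_weight_matching(weights)
    return total / max(len(weights), len(weights[0]))


if __name__ == "__main__":
    # Hypothetical concept-to-concept similarities for two questions.
    example = [[0.9, 0.8],
               [0.8, 0.1]]
    print(max_weight_matching(example))
    print(question_similarity(example))
```

In the example matrix a greedy choice of the single best pair (0.9) forces the weak pair (0.1), while the optimal matching takes the two 0.8 entries, which is why the problem is treated as a matching problem and not solved pair by pair.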
Keywords/Search Tags: Question Answering System, Question Similarity, Question Classification, Formal Concept Analysis, Ontology, Information Content