Improved Sentence Similarity Algorithm Research And Its Application In Question Answering System

Posted on: 2011-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: D J Wang
Full Text: PDF
GTID: 2178330332483499
Subject: Computer application technology
Abstract/Summary:
Question answering systems have become a research focus because they can give users accurate answers quickly. In every kind of question answering system, finding similar sentences is key to the system's accuracy, which makes research on sentence similarity algorithms basic and important work.

In this paper, we survey existing sentence similarity algorithms and weigh the advantages and disadvantages of each. To address their shortcomings, such as the separation of statistical and semantic information and overly simple weight calculation, we propose an improved sentence similarity algorithm based on the vector space model (VSM). The improved algorithm attempts to solve the Chinese sentence similarity problem by dynamically integrating the traditional VSM-based sentence similarity algorithm into the traditional semantics-based one. First, we perform word sense disambiguation and word sense tagging based on HIT IR-Lab Tongyici Cilin (Extended), group words with the same or similar meaning into a concept, and treat the concept as the basic linguistic and statistical unit of a sentence. Second, when calculating concept weights, we use not only the TF-IDF weighting method but also a professional weight that distinguishes professional (domain-specific) words from general words. Finally, we apply the improved algorithm to a question answering system in the medical care field and evaluate its efficiency and accuracy.

The results show that the efficiency and accuracy of the improved algorithm are greatly enhanced. The research and its results have practical value and application prospects for question answering systems in all areas.
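
The abstract does not give formulas, so the following Python sketch only illustrates the general idea of the two steps it describes: mapping words to concepts before building the vector, and combining TF-IDF with a domain weight. Everything named here is an assumption made for illustration: the toy table CONCEPT_OF stands in for Tongyici Cilin (Extended), English tokens stand in for segmented Chinese words, word sense disambiguation is skipped (each word maps to one concept), and the multiplicative DOMAIN_BOOST is a guess at how a professional weight might enter. It is not the thesis's actual method.

```python
import math
from collections import Counter

# Hypothetical synonym table: word -> concept ID (stand-in for Tongyici Cilin).
CONCEPT_OF = {
    "doctor": "C_PHYSICIAN",
    "physician": "C_PHYSICIAN",
    "medicine": "C_DRUG",
    "drug": "C_DRUG",
    "fever": "C_FEVER",
}

# Hypothetical set of domain ("professional") concepts and their boost factor.
DOMAIN_CONCEPTS = {"C_DRUG", "C_FEVER"}
DOMAIN_BOOST = 2.0


def to_concepts(tokens):
    """Replace each token with its concept ID; unknown tokens stay as-is."""
    return [CONCEPT_OF.get(t, t) for t in tokens]


def idf_table(corpus):
    """Smoothed IDF per concept over a corpus of tokenized sentences."""
    n = len(corpus)
    df = Counter()
    for sentence in corpus:
        df.update(set(to_concepts(sentence)))
    return {c: math.log((n + 1) / (d + 1)) + 1.0 for c, d in df.items()}


def weight_vector(tokens, idf):
    """Concept vector weighted by TF-IDF times an assumed domain boost."""
    tf = Counter(to_concepts(tokens))
    # Concepts unseen in the corpus get the largest known IDF (an assumption).
    default_idf = max(idf.values(), default=1.0)
    vec = {}
    for concept, freq in tf.items():
        w = freq * idf.get(concept, default_idf)
        if concept in DOMAIN_CONCEPTS:
            w *= DOMAIN_BOOST
        vec[concept] = w
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(tokens_a, tokens_b, idf):
    """Sentence similarity as cosine of the weighted concept vectors."""
    return cosine(weight_vector(tokens_a, idf), weight_vector(tokens_b, idf))
```

Under these assumptions, two questions that differ only in a synonym (for example "drug" versus "medicine") map to the same concept vector and score as near-identical, which a purely word-based VSM would not do.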
Keywords/Search Tags: Question Answering, Sentence Similarity, Word Sense Disambiguation, Tongyici Cilin, VSM