Research on a Sentence Similarity Computation Method Based on HowNet

Posted on: 2007-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
GTID: 2178360185997373
Subject: Computer application technology
Abstract/Summary:
Text similarity is a systematic research subject in which similarity computation at different levels of text is closely interrelated. This thesis studies similarity computation at several levels, with an emphasis on sentence similarity. First, different methods of word similarity computation are analyzed, and the HowNet-based word similarity method, one of the best-performing methods at present, is implemented. Second, two new sentence similarity computation methods are presented: one based on the co-occurrence of similar word pairs, and one based on a semantic expression model. Finally, the new methods are applied in a bank-domain automatic Chinese question-answering system to test their feasibility and validity. The main work and results of this thesis are as follows:

(1) Methods for Chinese word similarity computation are analyzed, and the HowNet-based method is implemented. At present, mainstream Chinese word similarity computation relies on a semantic dictionary, HowNet in particular. This approach is better than methods based on shared characters because it computes similarity from the concepts that the words represent, and it also avoids the data noise and data sparseness that affect statistics-based methods. The method is implemented so that it can be used in sentence similarity computation.

(2) A new sentence similarity computation method based on the co-occurrence of similar word pairs is presented. Because sentence structure analysis is difficult, the main approach to sentence similarity is to compute it from the similar words shared between the sentences. The words in a sentence are related to each other syntactically and semantically, so when similar words co-occur across two sentences, they reinforce one another's contribution to the similarity. A formula for this mutual reinforcement contribution is given, and sentence similarity is computed on that basis.

(3) An application of sentence similarity computation in a question-answering system is presented. A question-answering system offers a natural-language human-machine interface. Compared with a traditional keyword-based search engine, it is more accurate, simpler, and more efficient. The similarity computation is used to search for similar questions in the FAQ base of the QA system, which illustrates how it is realized in practice. A similar-question search experiment verifies the feasibility and validity of the new methods.
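The general idea of scoring two sentences from their similar word pairs, and then using that score to retrieve the closest FAQ question, can be sketched as follows. This is a minimal illustration, not the thesis's method: the abstract does not give the mutual contribution formula, so the sketch uses a plain symmetric best-match average instead. The `TOY_SIM` table is a hypothetical stand-in for a HowNet-based word similarity measure, the function names are invented for illustration, and the sentences are assumed to be already segmented into words.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical word-pair similarities; a real system would compute these
# from a semantic dictionary such as HowNet.
TOY_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"open", "create"}): 0.8,
    frozenset({"account", "card"}): 0.6,
}

WordSim = Callable[[str, str], float]


def word_sim(a: str, b: str) -> float:
    """Return 1.0 for identical words, else the toy table value (default 0.0)."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset({a, b}), 0.0)


def directed_sim(src: List[str], dst: List[str], sim: WordSim) -> float:
    """Average, over words in src, of the best similarity to any word in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(sim(w, v) for v in dst) for w in src) / len(src)


def sentence_sim(s1: List[str], s2: List[str], sim: WordSim = word_sim) -> float:
    """Symmetric best-match average (a simple baseline, not the thesis formula)."""
    return 0.5 * (directed_sim(s1, s2, sim) + directed_sim(s2, s1, sim))


def best_faq_match(
    query: List[str],
    faq: List[Tuple[List[str], str]],
    sim: WordSim = word_sim,
) -> Tuple[float, List[str], str]:
    """Return (score, question, answer) for the FAQ entry most similar to query."""
    scored = [(sentence_sim(query, q, sim), q, a) for q, a in faq]
    return max(scored, key=lambda t: t[0])


if __name__ == "__main__":
    faq = [
        (["create", "account"], "Visit a branch with your ID."),
        (["close", "card"], "Call the service hotline."),
    ]
    score, question, answer = best_faq_match(["open", "account"], faq)
    print(score, question, answer)
```

A co-occurrence-aware method such as the one the abstract describes would presumably adjust the per-word contributions when several similar pairs appear together; that adjustment is left out here because its form is not stated.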
Keywords/Search Tags: Chinese information processing, sentence similarity computation, HowNet, question answering