Ellipsis Recovery Technology In Interactive Question Answering System

Posted on: 2011-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wei
Full Text: PDF
GTID: 2178330338989581
Subject: Computer Science and Technology
Abstract/Summary:
Ellipsis is a very common phenomenon in Chinese, and research on Chinese ellipsis has been a hot topic for many years. It matters for research areas such as question answering, machine translation, and information retrieval. This thesis presents a method for ellipsis recovery in interactive question answering.

Interactive question answering systems receive many elliptical questions. Statistics show that about 40% of the questions are elliptical. Without ellipsis recovery, there is little hope that the system will give a correct answer, so ellipsis recovery is vitally important for interactive question answering systems.

This thesis uses Wikipedia and HowNet to build a semantic relation table. New words are extracted from Wikipedia using its classification information and pattern matching. The resulting semantic network includes relations such as hyponymy, and these relations are used for ellipsis recovery.

The questions are classified into seven categories using machine learning. The selected features include semantic relations, sentence structure, and others. Four classification algorithms are compared; C4.5 performs best, with a precision of 76.15%.

After classification, the system recovers each elliptical question according to its category. The main task is to determine which part is missing and to choose the best candidate for the missing part. Semantic relations, distance, and other information are used to make this decision.

Finally, we evaluated the results of ellipsis recovery: the precision on two different corpora is 82.9% and 75.3%. We also compared the results with those of other researchers, and the method used in this thesis achieved better results.
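The candidate-selection step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy relation table, the scoring formula, the `distance_weight` value, and the function names `relation_score`, `best_candidate`, and `recover` are all assumptions made for illustration, and the example uses English tokens for readability. It only shows how a semantic relation and a distance penalty might be combined to pick which word of the previous question an elliptical follow-up replaces.

```python
# Toy semantic relation table (hypothetical entries; the thesis builds its
# table from Wikipedia and HowNet). Maps a word to its hypernym.
HYPERNYMS = {
    "Beijing": "city",
    "Shanghai": "city",
    "population": "attribute",
    "area": "attribute",
}


def relation_score(a, b):
    """Return 1.0 if both words share a hypernym in the toy table, else 0.0."""
    ha, hb = HYPERNYMS.get(a), HYPERNYMS.get(b)
    return 1.0 if ha is not None and ha == hb else 0.0


def best_candidate(previous_tokens, fragment_word, distance_weight=0.1):
    """Index of the previous-question token that fragment_word most plausibly replaces.

    The score is an assumed combination: semantic relatedness minus a small
    penalty that grows with distance from the end of the previous question.
    If no token is related, the last token wins by default.
    """
    n = len(previous_tokens)
    best_index, best_score = None, float("-inf")
    for i, token in enumerate(previous_tokens):
        distance = n - 1 - i
        score = relation_score(token, fragment_word) - distance_weight * distance / n
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def recover(previous_tokens, fragment_word):
    """Rebuild the full question by substituting fragment_word into the previous one."""
    index = best_candidate(previous_tokens, fragment_word)
    restored = list(previous_tokens)
    restored[index] = fragment_word
    return restored


previous = ["What", "is", "the", "population", "of", "Beijing"]
print(" ".join(recover(previous, "Shanghai")))
print(" ".join(recover(previous, "area")))
```

Here the follow-up "And Shanghai?" is reduced to the single content word "Shanghai", which replaces "Beijing" because the two share a hypernym; "area" replaces "population" for the same reason. A real system would first use the question's category to decide which kind of recovery applies.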
Keywords/Search Tags: ellipsis recovery, semantic relation, interactive question and answering, HowNet, Wikipedia