Answer Selection And Answer Validation For Complex Chinese Question Answering

Posted on: 2010-08-26 | Degree: Master | Type: Thesis
Country: China | Candidate: B J Xu | Full Text: PDF
GTID: 2178360278966399 | Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Question Answering (QA) is an active research topic in Natural Language Processing and Information Retrieval, with high academic and practical value. Current QA research focuses on complex questions rather than factoid ones. One difficulty is that a system often returns many duplicate answers when handling complex questions, so redundant answer candidates must be removed, a step called answer filtering. Answer validation is another important module for improving system performance.

The literature contains many studies on answer filtering and answer validation for factoid answers. The target answer of a factoid question generally has a defined type and a short length: a single word, phrase, number, time expression, or named entity. The target answer of a complex question, however, is usually a long sentence with semantic structure. Most current methods of answer filtering and answer validation, which are based on type checking or synonym clustering, cannot be applied to QA systems for complex questions.

This thesis studies answer filtering and answer validation for complex Chinese questions. First, word similarity is calculated with a method based on the HowNet dictionary, and the similarity between sentences is then computed with a named-entity-weighted algorithm developed from an improved edit distance. Second, we propose an answer reranking method that uses statistical information about the named entities in the answer candidate set: candidates are reranked according to an information entropy score, and the information value obtained during reranking is then used as the weight in the sentence similarity calculation. This method highlights the importance of named entities and measures candidate similarity according to the similarity of the related information, which may enable duplicate answers to be removed at the pragmatic level.

The proposed approach was evaluated on the test corpus of the NTCIR-7 CLQA Track and compared with content-based answer validation on the Web. The results show that the proposed approach is superior on Definition and Relationship questions but does not work well on Event questions.
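
The pipeline described in the abstract (entity statistics, reranking, entity-weighted edit distance, duplicate removal) can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything specific in it is an assumption: the information value of an entity is taken to be -log2 of the share of candidates that mention it, a token's weight is 1 plus that value, candidates are sorted by descending score, and the HowNet word similarity is replaced by a hypothetical exact-match placeholder `exact_match`. The function names and the 0.8 threshold are also illustrative.

```python
import math
from collections import Counter


def entity_information(candidate_entities):
    """Assumed information value: -log2(share of candidates mentioning the entity)."""
    n = len(candidate_entities)
    doc_freq = Counter(e for ents in candidate_entities for e in set(ents))
    return {e: -math.log2(c / n) for e, c in doc_freq.items()}


def rerank(candidates, candidate_entities, info):
    """Sort candidates by the summed information value of their entities.
    Descending order is an illustrative choice, not taken from the thesis."""
    scores = [sum(info[e] for e in set(ents)) for ents in candidate_entities]
    order = sorted(range(len(candidates)), key=lambda i: (-scores[i], i))
    return [candidates[i] for i in order]


def exact_match(a, b):
    """Hypothetical stand-in for a HowNet-based word similarity in [0, 1]."""
    return 1.0 if a == b else 0.0


def weighted_similarity(s, t, info, word_sim=exact_match):
    """Similarity in [0, 1] from an edit distance whose costs grow with entity weight."""
    def weight(tok):
        return 1.0 + info.get(tok, 0.0)  # non-entity tokens get weight 1

    total = sum(weight(x) for x in s) + sum(weight(x) for x in t)
    if total == 0:
        return 1.0  # both sentences are empty
    # prev[j] = cost of turning the processed prefix of s into t[:j]
    prev = [0.0]
    for b in t:
        prev.append(prev[-1] + weight(b))
    for a in s:
        cur = [prev[0] + weight(a)]
        for j, b in enumerate(t, start=1):
            substitute = prev[j - 1] + max(weight(a), weight(b)) * (1.0 - word_sim(a, b))
            delete = prev[j] + weight(a)
            insert = cur[j - 1] + weight(b)
            cur.append(min(substitute, delete, insert))
        prev = cur
    return 1.0 - prev[-1] / total


def filter_duplicates(candidates, info, threshold=0.8, word_sim=exact_match):
    """Greedy filtering: keep a candidate only if it is not too similar to a kept one."""
    kept = []
    for cand in candidates:
        if all(weighted_similarity(cand, k, info, word_sim) < threshold for k in kept):
            kept.append(cand)
    return kept
```

Each candidate here is a list of tokens, and `candidate_entities` lists the named entities found in each candidate; segmentation and entity recognition are assumed to happen upstream. Normalizing by the combined weight of both sentences keeps the similarity within [0, 1], since deleting one sentence and inserting the other is always a valid edit path.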
Keywords/Search Tags: Question Answering, Sentence Similarity Calculation, Information Entropy, Answer Selection, Answer Validation