Research On The Re-use Of Community Question Answering Knowledge

Posted on: 2012-03-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Cao
Full Text: PDF
GTID: 1118330362958322
Subject: Computer application technology
Abstract/Summary:
Given user questions as input, traditional question answering systems try to provide answers by retrieving and analyzing documents. However, because the document processing techniques involved are complex, such systems do not scale easily to open-domain questions. This thesis therefore attempts to achieve automatic question answering by re-using community question answering knowledge instead. In this thesis, the knowledge is represented as pairs of question and answer, which are automatically mined from community question answering services, online forums, and FAQ systems. Compared with traditional question answering systems, the proposed approach is thus able not only to avoid complex document analysis but also to provide more precise answers (as the questions and answers are already available). Using a divide-and-conquer approach, the problem of re-using community question answering knowledge is further decomposed into four sub-problems as follows:

1. Extraction of questions and answers: This problem aims at extracting questions and answers from community question answering services, FAQ services, and online forums, and then storing them as knowledge for future re-use. Because extracting questions and answers from the first two types of system is not so difficult, this thesis focuses on extracting questions and answers from online forums. To address the problem, this thesis proposes a novel form of graphical representation and then, on the basis of the new representation, introduces a new structural support vector model.

2. Question search and recommendation: Question search is one of the most used mechanisms for the re-use of question answering knowledge. Specifically, given a question as a query, question search returns questions that are semantically equivalent or close to the queried question. Question recommendation is another, novel mechanism for knowledge re-use proposed by this thesis. Question recommendation tries to find and recommend questions which share the same main topic(s) as the queried question but provide different aspects of the topic(s). To address both problems, this thesis proposes a novel data structure consisting of question topic and question focus to represent questions. On the basis of this data structure, a new language model is developed for question search, and a novel method of replacing question foci is proposed for question recommendation.

3. Question utility: To further improve question search (or recommendation), this thesis proposes to study the problem of static ranking for question search. More specifically, it proposes a measure called question utility for the static ranking. Question utility characterizes how popular a question is (i.e., how likely it is to be asked repeatedly by people). For the automatic evaluation of question utility, this thesis explores techniques such as language modeling, LexRank, and their combination. In addition, this thesis empirically demonstrates the usefulness of static ranking for question search.

4. Question interestingness: Questions at community question answering services can be voted on and rated as 'interesting' by users. These votes express users' preferences and thus can be used as references for achieving a browsing-based re-use of question answering knowledge. However, usually only a small proportion of questions eventually receive 'interesting' votes. To overcome this data sparsity problem, this thesis proposes to study the problem of automatically predicting question interestingness. Question interestingness is defined as the likelihood that a question is considered 'interesting' by users. To solve the problem, this thesis further proposes a new algorithm called the 'majority-based perceptron algorithm', which emphasizes training on data instances that represent the preference of the majority of users and thus avoids the influence of noisy instances.
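
As a rough illustration of the language-modeling idea behind question utility in item 3, the sketch below scores a question by its average per-token log-probability under a unigram model trained on a collection of previously asked questions. This is only one plausible reading of "language modeling" here and is not taken from the thesis: the function names `train_unigram` and `utility`, the whitespace tokenization, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def train_unigram(questions):
    """Build an add-one-smoothed unigram model from a list of question strings.

    Hypothetical helper: returns a function mapping a token to its log-probability.
    """
    counts = Counter(tok for q in questions for tok in q.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen tokens

    def log_prob(token):
        # Counter returns 0 for tokens that were never observed.
        return math.log((counts[token] + 1) / (total + vocab))

    return log_prob


def utility(question, log_prob):
    """Average per-token log-probability, used here as a stand-in utility score.

    Length normalization keeps long questions from being penalized merely
    for containing more tokens.
    """
    tokens = question.lower().split()
    if not tokens:
        return float("-inf")
    return sum(log_prob(t) for t in tokens) / len(tokens)


if __name__ == "__main__":
    # Toy collection, invented purely for illustration.
    collection = [
        "how to reset password",
        "how to reset my password",
        "how to change password",
        "best hiking trails near lake",
    ]
    lp = train_unigram(collection)
    for q in ["how to reset password", "hiking trails near lake"]:
        print(q, round(utility(q, lp), 3))
```

Under this sketch, a question built from frequently asked wording receives a higher score than one built from rare wording, which matches the intuition that utility reflects how likely a question is to be asked again. A query-independent score of this kind could then be combined with a query-dependent relevance score at search time, which is the usual role of a static rank.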
Keywords/Search Tags: Community Question Answering, Extraction of Questions and Answers, Question Search, Question Recommendation, Question Utility, Question Interestingness, Question Topic and Focus