
Answer Credibility Analysis in Interactive Question Answering Communities

Posted on: 2014-12-18, Degree: Master, Type: Thesis
Country: China, Candidate: R H Wu
GTID: 2268330422965712, Subject: Computer application technology
Abstract/Summary:
In recent years, with the development of Web 2.0, users have become not only viewers of web content but also its editors, which has given rise to many network applications built on User Generated Content. The Question Answering Community (QAC) is one such application. In the basic QAC model, a user asks a question according to his or her own need, and other users answer according to their understanding of the question. Because answerers differ greatly from one another, answer credibility varies widely, and this affects the questioner and other readers of the question differently. Answer credibility is therefore a major problem in QAC. Motivated by this, we focus on answer credibility analysis and divide the issue into three parts: multiword expression (MWE) extraction from QAC questions, answer credibility classification in QAC, and discrimination of the most credible answer.

First, we study multiword expression extraction from QAC questions. MWE extraction is mainly used to parse the questions and to build the credible information library. Based on the characteristics of MWEs in questions, we propose a method for extracting them. We first use the mutual information method and stop-word filtering to obtain candidate MWEs. We then classify the candidates into four types: right string, incomplete string, redundant string, and error string. Finally, with the help of query optimization in search engines and the retrieval results for the candidate MWEs on the internet, we design a revising method to obtain the final MWEs. Using questions from the Sina iask question library as the experimental corpus, the precision, recall, and F-measure reach 84%, 52%, and 0.64, respectively, which demonstrates the effectiveness of the proposed method.

Second, we study answer credibility classification. Based on the characteristics of QAC, we propose two new features, a text normalization feature and an uncertainty tone feature, which allow answers to be classified from more perspectives. Using a Logistic Regression model, we combine the new features with classic text features, statistical features, and user features to analyze answer credibility. Taking Sina iask questions in the health domain as the experimental corpus, we find that the new features improve the answer classification results.

Third, we study discrimination of the most credible answer. We put forward a method for building a credible information library and propose a method that applies the library together with question-answer features to discriminate the most credible answer. We first choose credible question-answer pairs and credible material related to the question as the key components. We then associate the two parts in a suitable way, which makes the library convenient for future use. After that, we present one way of using the library and show that using it to discriminate the most credible answer is feasible. The presented discrimination method achieves better performance than the traditional method.
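
The candidate-generation step of the first part (mutual information plus stop-word filtering) could be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation. The abstract does not give the scoring formula, thresholds, or stop-word list, so the use of pointwise mutual information over adjacent word pairs, the function name `candidate_mwes`, the parameters `min_pmi` and `min_count`, and the tiny `STOP_WORDS` set are all assumptions. The input is assumed to be already tokenized (for the Chinese Sina iask corpus this would require a word segmenter), and English tokens are used here only for readability.

```python
import math
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "is", "to", "in", "what", "how"}


def candidate_mwes(sentences, min_pmi=1.0, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    sentences: list of token lists (already segmented).
    Returns a dict mapping (w1, w2) to its PMI for every pair that
    occurs at least min_count times, contains no stop word, and has
    PMI >= min_pmi. The thresholds are assumed values.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scored = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # too rare to estimate reliably
        if w1 in STOP_WORDS or w2 in STOP_WORDS:
            continue  # stop-word filtering
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        pmi = math.log2(p_xy / (p_x * p_y))
        if pmi >= min_pmi:
            scored[(w1, w2)] = pmi
    return scored


if __name__ == "__main__":
    questions = [
        ["what", "is", "high", "blood", "pressure"],
        ["how", "to", "lower", "high", "blood", "pressure"],
        ["is", "blood", "pressure", "of", "the", "elderly", "high"],
    ]
    for pair, score in sorted(candidate_mwes(questions).items()):
        print(pair, round(score, 2))
```

Candidates produced this way would still need the later steps described above: classification into the four string types and revision against search engine results. Neither step is attempted in this sketch.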
Keywords/Search Tags: QAC, multiword expression, search engine, answer credibility analysis, credible information library