Research on Community Detection in Community-based Question and Answering Systems

Posted on: 2015-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: X N Feng
GTID: 2268330428999882
Subject: Computer software and theory
Abstract/Summary:
Community-based Question and Answer (CQA) systems, such as Yahoo! Answers and Baidu Knows, provide people with free answers to questions by drawing on collective intelligence. However, CQA systems have no explicit "community" structure, which would play an important role in many applications. Furthermore, because CQA systems are open to all Internet users and to search engines, they are vulnerable to spam-account farming, which rapidly degrades the quality of the knowledge they hold.

Given these problems, it is necessary to analyze the characteristics of CQA users' behavior and of the user networks. The results of this analysis can in turn help with problems such as detecting spam accounts and providing personalized services.

This thesis focuses on Baidu Knows, the largest CQA system in China, analyzes its users' behavior, and studies its social network structure. By exploiting the question-answer interactions among users, two networks are constructed and shown to have strong social network characteristics. In addition, interest-oriented user communities can be observed in these networks. We then propose the Multilayer Speaker-listener Label Propagation Algorithm (MSLPA), an improved variant of SLPA, to detect user communities in CQA networks. MSLPA's performance is evaluated in terms of network size, community topics, clustering, and hierarchy. Compared with existing algorithms, MSLPA effectively detects genuine, overlapping, hierarchical communities whose users share common interests, and avoids forming a large number of tiny communities.

Community detection is then applied to identifying spam accounts in CQA systems. To detect spam accounts, we propose a group of highly discriminative account properties: individual properties computed by statistical analysis, and community-structure properties computed by community detection. Applying the proposed properties in a simple J48 classifier, experiments show that these properties perform well and improve classification accuracy.
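
For orientation, the base SLPA procedure that MSLPA builds on can be sketched roughly as follows. This is a minimal illustration of plain SLPA, not the thesis's MSLPA, whose multilayer extension is not described in this abstract. The helper names `build_graph` and `slpa`, the choice of an undirected asker-answerer graph, and the parameter values are all assumptions made for the example; the abstract does not say how its two networks are constructed.

```python
import random
from collections import Counter, defaultdict


def build_graph(qa_pairs):
    """Assumed construction: an undirected edge links an asker and an answerer."""
    graph = defaultdict(set)
    for asker, answerer in qa_pairs:
        if asker != answerer:
            graph[asker].add(answerer)
            graph[answerer].add(asker)
    return graph


def slpa(graph, iterations=20, threshold=0.1, seed=0):
    """Plain SLPA: returns {label: set of users}; a user may appear in several sets."""
    rng = random.Random(seed)
    # Each node's memory starts with its own id as the only label.
    memory = {node: Counter({node: 1}) for node in graph}
    nodes = sorted(graph)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            heard = Counter()
            # Each neighbor "speaks" one label, sampled in proportion to
            # how often that label occurs in the neighbor's memory.
            for speaker in sorted(graph[listener]):
                labels = list(memory[speaker].keys())
                weights = list(memory[speaker].values())
                heard[rng.choices(labels, weights=weights)[0]] += 1
            if heard:
                # The listener stores the most frequently heard label,
                # breaking ties at random.
                top = max(heard.values())
                winners = sorted(l for l, c in heard.items() if c == top)
                memory[listener][rng.choice(winners)] += 1
    # Post-processing: keep labels whose share of a node's memory reaches
    # the threshold; keeping more than one label yields overlapping communities.
    communities = defaultdict(set)
    for node, counts in memory.items():
        total = sum(counts.values())
        for label, count in counts.items():
            if count / total >= threshold:
                communities[label].add(node)
    return dict(communities)


if __name__ == "__main__":
    pairs = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z")]
    for label, members in slpa(build_graph(pairs)).items():
        print(label, sorted(members))
```

Because labels travel only along edges, users in disconnected parts of the graph never share a community. Lowering `threshold` keeps more labels per user and so produces more overlap.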
Keywords/Search Tags: Community-based Question and Answering System (CQA), Community Detection, MSLPA, Social Network Analysis, Spammer Detection