
Mining Web Social Networks

Posted on: 2010-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Lin
GTID: 1118360302979300
Subject: Computer software and theory
Abstract/Summary:
Social network analysis is widely recognized as an important tool for understanding human behavior and analyzing social structure. In the age of Web 2.0, more and more users are joining Web communities. Through users' content contributions and their frequent communication and collaboration, the Web has become a huge social network holding large volumes of social content. As a result, recent years have witnessed an emerging research trend in mining Web social networks. Such research has proved helpful in capturing Web user behavior patterns, improving the performance of Web applications (such as recommender systems, information retrieval systems, and public sentiment systems), improving the user experience, and increasing working efficiency.

However, mining Web social networks is challenging, because virtual interactions among Web users differ from actual interactions among people, and Web content differs from traditional content. In general, four factors prevent researchers from fully exploiting Web social networks. First, Web social networks are implicit, whereas traditional social networks are explicit. Second, the widely available social content created by Web users carries abundant semantics. Third, with various kinds of social interactions and heterogeneous social actors, Web social networks are multi-mode networks. Finally, because users are encouraged to contribute content, there is a mass of junk content and junk links, and content quality varies widely.

To address these challenges, this dissertation focuses on mining Web text data in order to extract implicit social networks, reveal semantics, identify junk content, measure content quality, and mine multi-mode social networks. Several techniques based on matrices, generative models, and Markov chains are proposed and applied to Web applications including expert search, junk identification, and text clustering.

The first part of this dissertation addresses mining social networks in Web forums. Threaded discussions are a popular way for Web users to exchange information, and they are employed in a wide range of Web applications, including Web forums, instant messaging, chat rooms, and Web logs (blogs). They are therefore valuable data sources for knowledge mining. This research addresses three aspects of mining large-scale social networks in Web forums.

- User behavior analysis: In Web forums, users exchange ideas and opinions with each other by posting comments and discussions. By analyzing the diverse posting behavior and social behavior of forum users, this contribution reveals that reply, knows, and friend relations significantly affect the diffusion of interest and expertise in Web forums.
- Modeling forum data based on matrix: Semantics and structure are coupled in threaded discussions: replies indicate sharing of topics, and vice versa. To model this property, a matrix-based SMSS model is proposed to simultaneously model the semantics and structure of threaded discussions. The model imposes two sparsity constraints, forcing a sparse reconstruction of each post in the topic space and a sparse approximation of each post from previous posts. The SMSS model is successfully employed in three applications: social network extraction, junk identification, and expert search.
- Modeling forum data based on generative models: Inspired by the intuition behind the SMSS model, generative models are presented to model the semantics and structure of threaded discussions. In particular, a PLSA-style model with a regularizer is presented to extract reply relationships; LDA-style models are presented to distinguish junk topics from meaningful topics; and user posting patterns are learned to leverage both the quantity and the quality of related posts in ranking experts.

The second part of this dissertation focuses on mining multi-mode social networks. To address the problem of mining experts in multi-mode networks, an ergodic Markov chain model for multi-mode networks is presented to discover experts. Mining experts in communities is studied to satisfy personal information needs in enterprise and academic environments.
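
The idea of ranking experts with an ergodic Markov chain can be illustrated with a minimal sketch. The abstract does not specify the dissertation's model, so everything below is an assumption made for illustration: the toy network (two users and three documents), the edge semantics (a document points to its author, a user points to the user they reply to), the damping factor, and the names `stationary_distribution`, `nodes`, and `user_nodes` are all hypothetical. The sketch uses a generic PageRank-style damped random walk, where the uniform jump is what makes the chain ergodic, and reads expert scores from its stationary distribution.

```python
def stationary_distribution(adjacency, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a damped random walk.

    adjacency[i][j] is the non-negative weight of the edge i -> j.
    The uniform jump (probability 1 - damping) makes every state reachable
    from every other state, so the chain is ergodic. This is a generic
    PageRank-style walk, not the dissertation's actual model.
    """
    n = len(adjacency)
    # Row-normalize the weights; a node with no outgoing edges jumps uniformly.
    transition = []
    for row in adjacency:
        total = sum(row)
        if total > 0:
            transition.append([weight / total for weight in row])
        else:
            transition.append([1.0 / n] * n)

    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical multi-mode toy network: two users and three documents.
nodes = ["alice", "bob", "doc_a", "doc_b", "doc_c"]
user_nodes = ["alice", "bob"]
index = {name: position for position, name in enumerate(nodes)}

adjacency = [[0.0] * len(nodes) for _ in nodes]
adjacency[index["doc_a"]][index["alice"]] = 1.0  # doc_a was written by alice
adjacency[index["doc_b"]][index["alice"]] = 1.0  # doc_b was written by alice
adjacency[index["doc_c"]][index["bob"]] = 1.0    # doc_c was written by bob
adjacency[index["bob"]][index["alice"]] = 1.0    # bob replies to alice

scores = stationary_distribution(adjacency)
# Rank only the user-mode nodes by their stationary probability.
ranking = sorted(user_nodes, key=lambda name: scores[index[name]], reverse=True)
print(ranking)
```

In this toy network, alice receives probability mass from two documents and from bob's reply, so she ends up ranked above bob. A real multi-mode setting would need edge weights and node types chosen to match the data, which the abstract does not describe.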
Keywords/Search Tags: Social Network Analysis, Text Mining, Expert Search, Junk Identification, Threaded Discussions