
Mining Weblogger's Interests And Study On Informative And Affective Weblog Content

Posted on: 2009-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X C Ni
Full Text: PDF
GTID: 2178360242476779
Subject: Computer application technology
Abstract/Summary:
With the rise of Web 2.0, the World Wide Web has entered a new phase. As a representative Web 2.0 application, the weblog is developing quickly and is driving the development of the whole WWW industry. In recent years, the dramatic growth in the number of weblogs has made them a new source of information. How to organize, search, exploit, and mine useful information from this abundant resource is attracting interest from both the academic and industrial communities. Success in these tasks will help to track the development trend of the WWW, refine online services, enrich users' online lives, and improve user experience. These goals are valuable for both practice and research.

Making use of the personalized and diverse nature of weblog content, this paper analyzes and studies weblog content, covering the following two topics.

First, this paper identifies webloggers' personal information and constructs their interest sets by mining the corresponding weblog content. It proposes a text classification based approach to mine a weblogger's interests automatically. In this approach, multiple classifiers are combined to increase classification accuracy and the reliability of the mined interests. In addition, a variant of top-down, level-based hierarchical text classification is used to mine more specific interests and to present the mined interests in a conceptual hierarchy. This task will benefit many WWW-related research areas and applications, such as personalized search, automatic recommendation of news and advertisements, and the construction of user social networks.

Second, this paper raises the problem of identifying informative and affective content at both the weblog-article level and the weblog level, and proposes solving it with text classification techniques by treating it as a classification task.
This paper examines the applicability of existing text mining techniques, including classification algorithms and feature selection algorithms. The experiments show that combining a Support Vector Machine classifier with Information Gain feature selection yields the best performance. Moreover, building on this task, this paper proposes three applications: emotion and topic classification of weblog articles, an intent-driven weblog search and browsing system, and the recommendation of highly informative weblogs. These are all active topics in weblog-related research. This task has a significant impact on the development of weblog-related research and applications.
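As an illustration of the Information Gain feature selection step named above, the following is a minimal sketch and not the implementation used in the thesis. It scores each term by how much knowing whether the term occurs in a document reduces uncertainty about the class, then keeps the top-scoring terms. The function names, the toy documents, and the "informative"/"affective" labels are all assumptions made for this example; training the Support Vector Machine on the selected terms is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'.

    docs is a list of sets of terms; labels is the parallel list of classes.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def select_top_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocabulary = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocabulary]
    # Highest gain first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy corpus: each document is the set of terms it contains.
docs = [
    {"happy", "today", "love"},
    {"sad", "today", "cry"},
    {"release", "today", "software"},
    {"tutorial", "software", "install"},
]
labels = ["affective", "affective", "informative", "informative"]

top = select_top_terms(docs, labels, 1)
print(top)
```

In this toy corpus the term "software" separates the two classes perfectly, so it receives the highest score, while "today" appears in both classes and scores much lower. In a full pipeline, only the selected terms would be kept as features for the classifier.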
Keywords/Search Tags: Weblog, Interest, Informative, Affective, Text Classification