
Internet Users Data Mining And Behavior Analysis

Posted on: 2015-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: L J Liu
Full Text: PDF
GTID: 2268330425988963
Subject: Communication and Information System
Abstract/Summary:
With the continuous development of the Internet and of users' requirements, research on behavior analysis and data mining of Internet users is developing rapidly. As a typical Web 2.0 application, the Internet forum plays a role in information dissemination and public-opinion guidance. Modeling and forecasting the interests of forum users therefore helps both to analyze those interests properly and to provide personalized services. Forecasting the heat of BBS posts is of great significance for grasping public opinion trends.

First, this paper briefly introduces some commonly used data mining algorithms and user interest models, and then processes a data set from the TianYa BBS. On this basis, it designs an interest weight updating algorithm suited to BBS users and uses it to predict users' interests. It then analyzes the characteristics that influence post heat and forecasts which posts are likely to become hot.

Because users' access time intervals vary widely, the interest weight update algorithm is based on the forum access time interval and on the number of posts and replies made during that interval; both are treated as important weight variables. For interest forecasting, this paper designs a two-stage user interest clustering algorithm. Simulation on the forum data set verifies the effectiveness and accuracy of the interest updating algorithm and the interest prediction algorithm.

Forum post heat is affected by many factors. BBS users are linked by friendship and attention relationships, and each user has an individual experience value, so we extract user attributes and relationships as one aspect of influence. Because the size of a post's audience is closely related to its content, we take post content as another important aspect. In addition, a post's release time matters to some degree for its heat, so we also take the time factor into consideration. On the basis of this analysis of the characteristics that influence post heat, this paper performs SVM regression on post heat and achieves satisfactory results.

In the last part of this paper, user interest modeling and hot post forecasting are applied to network public opinion analysis, and a forum-based user behavior analysis system is designed. The system consists of several modules, including data acquisition, data preprocessing, user behavior analysis, and data storage. It is responsible for identifying user interests, compiling access time statistics, discovering active users and opinion leaders, predicting hot posts, and so on. The detailed design of each module is presented. This paper also builds the framework of the system as a basis for future implementation.

This work has been supported by the National Natural Science Foundation of China under Grants 61172072 and 61271308, the Beijing Natural Science Foundation under Grant 4112045, the Research Fund for the Doctoral Program of Higher Education of China under Grant W11C100030, the Beijing Science and Technology Program under Grant Z121100000312024, and the Beijing Municipal Commission of Education Discipline Construction and Graduate Construction Project.
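
The abstract says the interest weight update depends on the access time interval and on the number of posts and replies made during that interval, but it does not give the formula. The Python sketch below is one plausible reading, not the thesis's algorithm: the exponential decay over the interval, the logarithmic activity term, the parameters `half_life_days` and `gain`, and the `Visit` record are all assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Visit:
    """One forum visit for a single topic (hypothetical record layout)."""
    gap_days: float  # time since the user's previous visit
    posts: int       # posts on this topic made during the interval
    replies: int     # replies on this topic made during the interval


def update_interest_weight(weight, visit, half_life_days=30.0, gain=1.0):
    """Return the updated interest weight for one topic.

    Assumed form: the old weight halves every `half_life_days` of absence,
    then grows with the log of the activity observed in the interval.
    """
    decay = 0.5 ** (visit.gap_days / half_life_days)
    activity = math.log1p(visit.posts + visit.replies)
    return weight * decay + gain * activity


def replay(visits, **kwargs):
    """Apply the update over a sequence of visits, starting from zero interest."""
    weight = 0.0
    for visit in visits:
        weight = update_interest_weight(weight, visit, **kwargs)
    return weight
```

Under these assumptions, a user who returns after a long gap without posting keeps less of the earlier weight than one who returns quickly, while posts and replies raise the weight with diminishing returns.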
Keywords/Search Tags:Internet, User Behavior Analysis, Data Mining, Interest Model, Network Consensus