
Design And Implementation Of User Relationship Analysis System Based On Weibo

Posted on: 2017-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: S B Huang
Full Text: PDF
GTID: 2348330488959926
Subject: Software engineering
Abstract/Summary:
Today, people routinely express opinions and emotions and interact with one another on social networking tools, which therefore hold a great deal of valuable information. Within this information, user relationships are clearly the most important, because the essence of a social networking tool is to map relationships between people in the real world into virtual space. Because social network data has commercial value, it is difficult for researchers to obtain it directly from Internet companies. This paper therefore designs and implements a user relationship analysis system based on Sina Weibo, the most popular social networking tool, which addresses this problem to a certain extent.
First, this paper focuses on how to collect users' Weibo data. There are three ways to obtain Weibo data: the Sina Weibo API, a web crawler, and third-party data providers. Given the many restrictions on the Weibo API and the high fees charged by third-party data providers, this paper uses simulated login and web page parsing to implement a multi-account, multi-threaded Sina Weibo crawler in Python. The paper then analyzes the characteristics of the collected data, namely post content, likes ("praise"), reposts ("forwarding"), and comments, and finds that the volume is very large and the formats are varied. For this reason, the system stores the data in the NoSQL database MongoDB. By setting up a MongoDB sharded cluster, the system achieves distributed storage, which will make it easier to expand storage and to add a distributed crawler and parallel computing in the future.
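The multi-account, multi-threaded crawling idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `crawl` and `fake_fetch`, the account labels, and the round-robin assignment of accounts to requests are all assumptions made for illustration, and the actual simulated login and page parsing are replaced by a stub so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
import threading


def crawl(user_ids, accounts, fetch_page, max_workers=4):
    """Fetch one page per user id, spreading requests over several accounts.

    `fetch_page(account, user_id)` is supplied by the caller; in a real
    crawler it would perform the logged-in request and parse the HTML.
    """
    account_cycle = cycle(accounts)
    lock = threading.Lock()  # itertools.cycle is not safe to share unguarded

    def next_account():
        with lock:
            return next(account_cycle)

    def task(user_id):
        account = next_account()
        return user_id, account, fetch_page(account, user_id)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as user_ids
        return list(pool.map(task, user_ids))


def fake_fetch(account, user_id):
    # Hypothetical stand-in for simulated login plus page parsing.
    return {"user": user_id, "posts": []}


results = crawl(["u1", "u2", "u3"], ["acct_a", "acct_b"], fake_fetch)
```

Rotating accounts under a lock keeps the load on each account roughly even, which is one plausible way to stay within per-account request limits; which thread ends up with which account depends on scheduling.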
Finally, to analyze user relationships, this paper computes each user's attention to the blogger as a weighted combination of that user's likes, reposts, and comments on the blogger's posts. A user who has liked one or more posts is treated as a supporter; otherwise, the sentiment values of all of the user's reposts and comments are computed, and their sum is taken as the user's emotional tendency. Some high-attention users are then selected as supporters and some users with low emotional tendency as opponents, and the supporters and the opponents are each grouped separately by the number of reposts they have in common.
To test this strategy, the system uses the crawler to obtain all Weibo data of a well-known blogger and applies the analysis above to identify the blogger's supporters and opponents. It then collects those users' own Weibo data and groups the supporters and the opponents separately. Finally, the system validates the resulting user relationships and finds the results credible.
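One possible reading of the attention and emotional-tendency rules can be sketched as follows. The abstract does not give the weights, the sentiment scale, or the cut-offs, so the weight constants, the `top_k` limit, the "tendency below zero" condition, and all names here are assumptions for illustration only. Sentiment values are taken as precomputed numbers, and the later grouping by common reposts is not covered.

```python
from dataclasses import dataclass, field

# Hypothetical weights; the abstract does not state the values used.
W_LIKE, W_REPOST, W_COMMENT = 1.0, 2.0, 3.0


@dataclass
class Interaction:
    """One user's activity under a blogger's posts."""
    likes: int = 0
    repost_sentiments: list = field(default_factory=list)   # one score per repost
    comment_sentiments: list = field(default_factory=list)  # one score per comment


def attention(x):
    # Weighted count of likes, reposts, and comments.
    return (W_LIKE * x.likes
            + W_REPOST * len(x.repost_sentiments)
            + W_COMMENT * len(x.comment_sentiments))


def emotional_tendency(x):
    # Sum of the sentiment values of all reposts and comments.
    return sum(x.repost_sentiments) + sum(x.comment_sentiments)


def select_groups(users, top_k=10):
    """Return (supporters, opponents) as lists of user ids."""
    likers = [u for u, x in users.items() if x.likes >= 1]
    others = [u for u, x in users.items() if x.likes == 0]
    # Supporters: users who liked at least one post, highest attention first.
    supporters = sorted(likers, key=lambda u: attention(users[u]), reverse=True)[:top_k]
    # Opponents: non-likers with negative tendency, most negative first.
    negative = [u for u in others if emotional_tendency(users[u]) < 0]
    opponents = sorted(negative, key=lambda u: emotional_tendency(users[u]))[:top_k]
    return supporters, opponents


users = {
    "a": Interaction(likes=2, comment_sentiments=[0.5]),
    "b": Interaction(repost_sentiments=[-0.8], comment_sentiments=[-0.6]),
    "c": Interaction(comment_sentiments=[0.3]),
}
supporters, opponents = select_groups(users)
```

In this toy data, user "a" has liked posts and is picked as a supporter, user "b" has only negative reposts and comments and is picked as an opponent, and user "c" falls into neither group.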
Keywords/Search Tags:Weibo, Web Crawler, MongoDB, User Relationship, Emotional Tendency