Research And Implementation Of Data Acquisition System Oriented On Social Networks

Posted on: 2017-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: X B Huang
Full Text: PDF
GTID: 2348330518994791
Subject: Computer technology
Abstract/Summary:
The emergence of social networks has greatly changed people's daily lives. They strengthen connections between friends, reduce the cost of maintaining friendships, and widen the range of interpersonal relationships. Hundreds of millions of messages are published and spread on social networks every day, and hidden within this mass of information is a great deal that is useful for decision-making. How to capture such massive amounts of information is therefore a problem that needs to be solved. Sina Weibo is currently the largest social networking site, which makes it a well-suited object for social network research. In this thesis, Selenium is used to design and implement a data acquisition system oriented on social networks. The thesis first introduces traditional web crawler technology and its implementation principles, focusing on an analysis of anti-crawler strategies. It then studies how to acquire data through the Sina Weibo open platform; experiment and analysis show that the data obtained this way cannot satisfy the requirements of the system. Therefore, based on the characteristics of the Sina Weibo platform, Python and Selenium are used to implement a data acquisition system for social networks. The system supports automatic login to microblog accounts and automatic collection of microblog hot topics, microblog content, comments, and forwarding (repost) information. To guarantee the stability of the system, it is optimized against Sina Weibo's anti-crawler strategy, including forging the User-Agent property, controlling the acquisition rate, using proxy IPs, and automatically detecting and switching microblog accounts. Finally, functional and performance tests are performed, verifying the correctness and stability of the system.
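The anti-crawler countermeasures listed in the abstract (forged User-Agent, rate control, proxy IPs, account switching) can be illustrated with a short Python sketch. This is not the thesis's implementation, whose details are not given here. The class and function names, the User-Agent strings, the proxy addresses, and the account labels are all hypothetical placeholders. The sketch uses only the standard library; the strings returned by `chrome_arguments` are Chrome command-line flags of the kind that could be passed to Selenium's `ChromeOptions.add_argument`.

```python
import itertools
import random
import time
from dataclasses import dataclass

# Hypothetical placeholder values; the abstract does not list the real ones.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit/602.3.12 "
    "(KHTML, like Gecko) Version/10.0.2 Safari/602.3.12",
]
PROXIES = ["http://127.0.0.1:8001", "http://127.0.0.1:8002"]
ACCOUNTS = ["account_a", "account_b"]


@dataclass
class Identity:
    """One combination of User-Agent, proxy, and account for a session."""
    user_agent: str
    proxy: str
    account: str


class IdentityRotator:
    """Hands out a fresh identity each time a session must be replaced."""

    def __init__(self, user_agents, proxies, accounts, seed=None):
        self._rng = random.Random(seed)
        self._user_agents = list(user_agents)
        # Proxies and accounts are used round-robin so load is spread evenly.
        self._proxies = itertools.cycle(proxies)
        self._accounts = itertools.cycle(accounts)

    def next_identity(self):
        return Identity(
            user_agent=self._rng.choice(self._user_agents),
            proxy=next(self._proxies),
            account=next(self._accounts),
        )


def chrome_arguments(identity):
    """Build Chrome command-line flags for the given identity."""
    return [
        f"--user-agent={identity.user_agent}",
        f"--proxy-server={identity.proxy}",
    ]


class Throttle:
    """Controls the acquisition rate with a randomized pause between requests."""

    def __init__(self, min_delay, max_delay, sleep=time.sleep, seed=None):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._sleep = sleep  # injectable so the pause can be stubbed in tests
        self._rng = random.Random(seed)

    def wait(self):
        delay = self._rng.uniform(self.min_delay, self.max_delay)
        self._sleep(delay)
        return delay
```

In a crawler loop, one would call `Throttle.wait()` before each page load and request a new identity from `IdentityRotator` whenever the current account appears to be blocked. How such blocking is detected is specific to the target site and is not shown in this sketch.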
Keywords/Search Tags: social network, Sina Weibo, web crawler, Selenium, data collection