
Research On The Method Of Microblog Data Acquisition And Analysis

Posted on: 2013-09-21, Degree: Master, Type: Thesis
Country: China, Candidate: D T Tian, Full Text: PDF
GTID: 2248330371459370, Subject: Communication and Information System
Abstract/Summary:
Microblogging has quickly developed into a new form of social network, following the blog, and it has had great influence on the field of information media. For traditional forms of social network, data acquisition and analysis technology has matured, but data acquisition for microblogging networks and research into their characteristics remain incomplete. This paper studies the characteristics and effects of microblogging, together with two microblogging data acquisition techniques. Taking Sina microblogging as an example, a microblogging data acquisition and analysis system was designed and implemented, and the network characteristics of microblogging were simulated and analyzed. The main purpose is to analyze the characteristics of the microblogging network from the data obtained. The specific work is as follows:
1. Data acquisition with a web page crawler was studied, including the basic principles and workflow of general web crawlers, focused crawlers, web page pre-processing, and text classification.
2. The workflow for acquiring data through the microblogging system SDK was studied. This technique obtains user data by calling the API provided by the microblogging platform, and calling the API requires user identity authentication. The main authentication method at present is OAuth, which is described in detail in this paper. The method involves simple steps and obtains microblogging data accurately and efficiently.
3. The SDK program was improved over several experiments to simplify the authentication procedure, improve crawling efficiency, and avoid duplicate crawling. The improved program can acquire data continuously. The microblogging data fetched in this way forms the data set used to study microblogging network characteristics.
4. The framework of the data acquisition and analysis system was designed, along with the system database, function modules, and interface. The basic functions of microblogging data acquisition and analysis were implemented. Using the system, the microblogging network can be studied in depth.
5. The microblogging network topology, the in-degree distribution, and the out-degree distribution were analyzed. The conclusion is that the microblogging network has small-world, scale-free, and high-clustering properties.
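The abstract does not give the thesis's code, so the following is only a minimal sketch of two ideas it mentions: crawling a follow graph without fetching any user twice, and computing the in-degree and out-degree distributions of the result. The names `crawl_follow_graph`, `degree_distributions`, `fetch_followees`, and the toy `FOLLOWS` table are hypothetical. In a real system `fetch_followees` would wrap an OAuth-authenticated platform API call, which is not shown here.

```python
from collections import Counter, deque

# Toy stand-in for the platform API: user -> users they follow (hypothetical data).
FOLLOWS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}


def crawl_follow_graph(seed, fetch_followees, max_users=1000):
    """Breadth-first crawl from `seed`; each user is fetched at most once.

    `fetch_followees(user)` is an assumed callable returning the users
    that `user` follows. Returns a set of (follower, followee) edges.
    """
    visited = {seed}          # users already queued, to avoid duplicate crawling
    queue = deque([seed])
    edges = set()
    while queue:
        user = queue.popleft()
        for followee in fetch_followees(user):
            edges.add((user, followee))
            if followee not in visited and len(visited) < max_users:
                visited.add(followee)
                queue.append(followee)
    return edges


def degree_distributions(edges):
    """Return (in_dist, out_dist): degree value -> fraction of nodes."""
    nodes = {u for edge in edges for u in edge}
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    n = len(nodes)
    # Counter returns 0 for nodes with no incoming or outgoing edges.
    in_dist = Counter(in_deg[u] for u in nodes)
    out_dist = Counter(out_deg[u] for u in nodes)
    return (
        {k: c / n for k, c in in_dist.items()},
        {k: c / n for k, c in out_dist.items()},
    )


if __name__ == "__main__":
    crawled = crawl_follow_graph("a", lambda u: FOLLOWS.get(u, []))
    in_dist, out_dist = degree_distributions(crawled)
    print("in-degree distribution:", in_dist)
    print("out-degree distribution:", out_dist)
```

On a real data set, a scale-free network would show up as an approximately straight line when these distributions are plotted on log-log axes. The toy graph here is far too small to show that.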
Keywords/Search Tags: Microblogging, Web crawling, Scale-free network, Degree distribution