Data Acquisition Technology Of Microblogging Public Opinion Research System

Posted on:2015-06-07

Degree:Master

Type:Thesis

Country:China

Candidate:T G Lv

Full Text:PDF

GTID:2298330431487547

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the mobile Internet, more and more information is published on the Web. Online information that reflects people's intentions influences, and can even incite, Internet users, so online public opinion is receiving growing attention. Government departments should predict, detect and guide public opinion in order to foster a healthy Internet environment. Because of the rapid development of microblogging, more and more public opinion events are first exposed on Twitter. Microblogging now plays a critical role for both government departments and enterprises.

This paper analyzes the problems of collecting data from microblogs and proposes a method that simulates web page login to solve them. It then uses a priority queue to capture the more influential microblog users first.

Firstly, we analyze the current approaches to crawling microblog data, namely Web crawlers and methods based on the microblog API, and find that neither meets the demands of current public opinion systems in terms of scale and real-time requirements. We therefore propose simulating browser login to crawl page data, which gives easy, fast access to the data of any microblog user.

Secondly, to cope with the volume of data, we build a microblog user network. Each user is abstracted as a node, and follower, following, repost and comment relationships become edges between nodes. This network helps us discover new microblog users and ensures data integrity.

Finally, we collect data efficiently with a priority queue algorithm. Efficient collection here means collecting according to user influence: data from high-influence users is collected first, followed by data from less influential users.
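The combination of user network and priority queue described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `crawl_order`, the dictionary-based network, and the precomputed influence scores are all assumptions made for illustration, and the actual page fetching is omitted.

```python
import heapq

def crawl_order(seeds, neighbors, influence):
    """Return users in the order a priority-driven crawler would visit them.

    seeds:     iterable of starting user ids
    neighbors: dict mapping a user id to the ids it is linked to
               (follower, following, repost or comment relationships)
    influence: dict mapping a user id to a numeric influence score
               (hypothetical values; unknown users default to 0.0)
    """
    # heapq is a min-heap, so the score is negated to pop the highest first.
    heap = [(-influence.get(u, 0.0), u) for u in set(seeds)]
    heapq.heapify(heap)
    seen = {u for _, u in heap}
    order = []
    while heap:
        _, user = heapq.heappop(heap)
        order.append(user)  # a real crawler would fetch this user's pages here
        for other in neighbors.get(user, ()):
            if other not in seen:  # newly discovered user enters the queue
                seen.add(other)
                heapq.heappush(heap, (-influence.get(other, 0.0), other))
    return order

# Toy network: edges stand for any of the relationships listed above.
network = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
scores = {"a": 5.0, "b": 1.0, "c": 3.0, "d": 9.0, "e": 2.0}
order = crawl_order(["a"], network, scores)
```

In this toy run the crawler starts from `a`, and among the users discovered so far it always visits the one with the highest score next, so a highly influential user found late (here `d`) can overtake less influential users found earlier.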
The priority is computed with a calculation model: influence is calculated from a user's follower count, following count, activity, communication and timestamp. We also collect data from inactive users by calculating time intervals. To analyze the collected Web pages effectively, we design a parsing program that locates information directly through characteristic values without parsing the tags. With the resulting "clean" microblog data, we obtained some interesting information after simple analyses.

Experimental results show that the method is versatile and complete, requires no manual intervention, and obtains high-quality data at higher speed.
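The abstract names the inputs of the influence calculation but not the formula itself. The sketch below shows one plausible shape for such a score, a weighted sum of damped features. The weights, the logarithmic damping, the recency term, and every field name are assumptions made for illustration, not the model used in the thesis.

```python
import math
from dataclasses import dataclass

@dataclass
class UserStats:
    followers: int        # follower ("fans") count
    following: int        # following ("attention") count
    posts_per_day: float  # stand-in for activity
    interactions: int     # stand-in for communication (reposts plus comments)
    last_post_ts: float   # Unix timestamp of the most recent post

# Hypothetical weights; the source does not state the real ones.
WEIGHTS = {
    "followers": 0.4,
    "following": 0.1,
    "activity": 0.2,
    "communication": 0.2,
    "recency": 0.1,
}

def influence(user, now):
    """Weighted sum of damped features; a larger value means higher crawl priority."""
    days_idle = max(0.0, (now - user.last_post_ts) / 86400.0)
    features = {
        # log1p damps very large counts so one feature cannot dominate
        "followers": math.log1p(user.followers),
        "following": math.log1p(user.following),
        "activity": math.log1p(user.posts_per_day),
        "communication": math.log1p(user.interactions),
        # recency decays from 1 toward 0 as the user stays silent
        "recency": 1.0 / (1.0 + days_idle),
    }
    return sum(WEIGHTS[name] * value for name, value in features.items())

now = 1_000_000.0
popular = UserStats(100_000, 200, 5.0, 3_000, now - 3_600)
quiet = UserStats(50, 80, 0.1, 2, now - 30 * 86_400)
```

With these assumed weights, `influence(popular, now)` exceeds `influence(quiet, now)`, so the popular account would be collected first. The idle time computed inside the function is also the kind of quantity from which a revisit interval for inactive users could be derived.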

Keywords/Search Tags:

Microblogging Data, Analog Login, User Network, User Influence, Priority Queue

Related items

1	Research On User’s Influence In Microblogging
2	Key Technology Research Of User Influence In Social Network
3	Research On Community Detection Based On User Influence In Microblogging
4	Analysis Of User Influence Based On Topic Diffusion In Microblogging
5	Design And Implementation About User Login And Data Analysis In The Internet Of Things Base On PHP
6	Topic-Field-Specific User Influence Research And Implementation In Microblogging Platform
7	Research On Prediction Method Of User Influence And User Dynamic Behavior Of Online Review Website
8	Research And Implementation Of User Influence Analysis Model Based On Topic
9	The Microblogging Information Filtering Based On User Analysis
10	The Evaluation Of User Influence In Online Social Network Sites With Noise Existence