
QQ Space Data Research And Analysis Based On Scrapy Crawling

Posted on: 2017-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: B Yang
GTID: 2428330602461009
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of wireless networks and Web 2.0 technology, Internet access and interaction have become increasingly convenient for mobile terminal users. QQ Space is a large social networking platform where users share personal opinions and information in their personal space and express their emotions. Computing on and analyzing QQ Space data can provide an important scientific basis for research in marketing, psychology, sociology, and urban planning and development.

From a statistical perspective, this paper studies mobile phone users by mining QQ Space data. We first obtain QQ Space data with crawlers built on the Scrapy library, and then analyze mobile phone brands and user behavior on the basis of that data set. Our goal is to explore the users of different mobile phone brands, changes in users' daily schedules, and users' emotional states. The main work covers the following aspects:

1. Because network data is large in volume and complex in structure, sampling has become an indispensable step in the acquisition and analysis of big data. This paper uses a sampling algorithm based on a topological hierarchical model, which relies mainly on graph-theoretic data structures and the similarity principle to sample large data sets. In simulation experiments on large data sets, we found that the algorithm samples well, preserving the similarity of the relationship network between the sampled sub-network and the original network.

2. This paper uses the Scrapy crawler framework as the tool for crawling QQ Space data. We analyzed the format of QQ Space pages, combined the crawler with the sampling method, and ran the crawlers for 30 days on a PC to obtain the final QQ Space data set. The paper discusses in detail the simulation of a logged-in identity by the web crawler, the analysis of QQ Space network packets, the sampling of network data, and the design of the database that stores the data. It also analyzes the problems that arose during data capture and storage and gives the final solutions.

3. This paper uses the QQ Space data to analyze users' mobile phone brands, their work and rest times, and their emotions. The results reflect the distribution of mobile phone brands across different populations, the changes in users' work and rest times, and users' emotional intensity. These conclusions not only provide a scientific basis for research in sociology, marketing, and psychology, but also support government decision-making.
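
The abstract does not specify the topological hierarchical sampling algorithm, so the following is only an illustrative sketch of the general idea of sampling a social graph layer by layer. It is not the thesis's method. The function name `layered_sample`, its parameters, and the toy friend graph are all hypothetical, and plain breadth-first layering with uniform random selection is an assumption made here for illustration.

```python
import random


def layered_sample(graph, start, max_per_layer, depth, seed=0):
    """Hypothetical helper: sample nodes layer by layer outward from `start`.

    `graph` maps each node to a list of its neighbours (an adjacency list).
    Returns the subgraph induced by the sampled nodes, as an adjacency list.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    visited = {start}
    frontier = [start]
    for _ in range(depth):
        # Collect the not-yet-sampled neighbours of the current layer.
        candidates = sorted(
            {n for u in frontier for n in graph.get(u, ()) if n not in visited}
        )
        if not candidates:
            break
        # Keep at most `max_per_layer` of them, chosen uniformly at random.
        frontier = rng.sample(candidates, min(max_per_layer, len(candidates)))
        visited.update(frontier)
    # Induced subgraph: keep only edges whose endpoints were both sampled.
    return {u: [v for v in graph.get(u, ()) if v in visited] for u in visited}


# Toy friend graph (made-up data, not from the thesis).
friends = {
    "a": ["b", "c", "d"],
    "b": ["a", "e"],
    "c": ["a"],
    "d": ["a"],
    "e": ["b"],
}
sub = layered_sample(friends, "a", max_per_layer=2, depth=1)
```

Capping the number of nodes kept per layer bounds the sample size while still following the friendship topology outward from the seed user. How well such a sample preserves the structure of the original network would have to be measured separately, as the thesis reports doing for its own algorithm.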
Keywords/Search Tags:Data mining, Web crawler, Scrapy, QQ Space, Sample model, User's behavior analysis, Sentiment analysis