Research On The Method Of Social Network Data Collecting And Analyzing

Posted on: 2016-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
Full Text: PDF
GTID: 2308330461951480
Subject: Software engineering
Abstract/Summary:
Social networks let users communicate anytime and anywhere. With the spread of smartphones and 4G networks, social networks attract more and more users and have become one of the main channels of information dissemination. With the development of cloud computing and big data technology, academia and industry are paying increasing attention to the analysis and mining of social network data, and data collection is a key part of that analysis. To support research on information propagation models and user influence maximization in social networks, this paper studies data acquisition systems and then designs and implements a simple, general-purpose social network data collection system. Based on the collected data, the paper mainly analyzes the preferences of social network users. The main work is as follows:

Firstly, this paper introduces social network data acquisition systems. It presents two methods of data acquisition: web crawlers and open API interfaces. It examines in depth the technologies involved in the acquisition process, including OAuth authorization and authentication, XML and JSON parsing, and NoSQL databases.

Secondly, this paper designs and implements a social network data acquisition system. The system crawls social network data in breadth-first (BFS) order through the open API interface. It uses multiple accounts and multithreading to control the request frequency and improve crawling efficiency. It uses the Naive Bayes method to filter spam microblog posts, a hash map to eliminate duplicate users, and paging to solve the data integrity problem. The system stores the large volume of data in both MongoDB and text files, and preprocesses it with a data pruning method.

Thirdly, this paper analyzes user preferences in the social network. It uses TF-IDF (term frequency-inverse document frequency) to calculate term weights, uses a user preference model based on the vector space model (VSM) to obtain feature vectors for users and subjects, and computes the similarity of user preferences.

Finally, this paper analyzes social network data experimentally. It crawls Tencent Weibo data by random sampling, which ensures the randomness of the data, and the experiment demonstrates the feasibility and effectiveness of the data acquisition system. Analysis of the resulting data set verifies the validity of the user preference model and also verifies that the social network is a scale-free network.
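The TF-IDF weighting and vector-space similarity described above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `tf_idf_vectors` and `cosine_similarity`, the sample users, and the already-tokenized posts are all hypothetical, and the sketch assumes the common definitions tf = count / document length, idf = ln(N / df), and cosine similarity as the similarity measure.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector per user.

    docs maps a user id to the list of tokens from that user's posts
    (tokenization is assumed to have happened already).
    """
    n_docs = len(docs)
    # Document frequency: number of users whose posts contain each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    vectors = {}
    for user, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        vectors[user] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical, already-tokenized posts for three users.
posts = {
    "user_a": ["football", "match", "goal", "football"],
    "user_b": ["football", "goal", "league"],
    "user_c": ["recipe", "noodles", "cooking"],
}
vectors = tf_idf_vectors(posts)
print(cosine_similarity(vectors["user_a"], vectors["user_b"]))
print(cosine_similarity(vectors["user_a"], vectors["user_c"]))
```

In this toy data, user_a and user_b share terms and so receive a positive similarity, while user_a and user_c share none and score zero. Note that with this idf definition, a term used by every user gets weight zero, so it does not contribute to any similarity.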
Keywords/Search Tags: Social network, Data collection, Data storage, Data preprocessing, User preference analysis