
Design And Implementation Of Twitter Data Collection And User Profiling

Posted on: 2020-10-04  Degree: Master  Type: Thesis
Country: China  Candidate: Y H Han  Full Text: PDF
GTID: 2428330578980936  Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, social networks have become the main platform for communication and entertainment, and the volume of social network data has grown explosively. Compared with data generated by traditional computer applications, social network data is more timely, larger in scale, richer in type, and faster to spread. It is also cluttered and has low value density. How to find useful information in massive social network data and describe social network users comprehensively has become an urgent problem. To mine valuable information from social network data, we design and implement a Twitter data collection and user portrait system. The system collects social network data stably and in real time, mines users' implicit and social attributes, and describes social network users from multiple aspects. This thesis studies data collection and user portrait techniques for social networks and implements the system. The main work is as follows:

(1) We design a data acquisition technique based on web crawlers and the developer API, and establish a secure and stable data collection framework that gets past the anti-crawling mechanisms and authentication requirements of social networking sites to collect social network data continuously and stably. We improve acquisition efficiency through multi-threading and asynchronous processing. We build a social network data acquisition system that allows users to customize the collection methods and data types. The system formats and stores data according to its type and provides a data collection web service interface for other applications to call.

(2) We establish a social network user portrait model and define a description specification that describes social network users through basic attributes, implicit attributes, and social attributes, with the implicit and social attributes obtained accurately through mining methods. We build a user portrait system that automatically generates user portraits from social network data and formats the results for storage and output. Although the system uses Twitter as its data source, the mining methods used in this thesis also apply to other social network data (such as Weibo and Facebook).

(3) We construct a sample library of typical characters. The data in the sample library comes from the collected social network data; after cleaning, screening, and manual labeling, the library contains representative characters from various fields. We define an evaluation indicator that measures the accuracy of the portrait model based on its portrait results for these typical characters.
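
As a rough illustration of points (2) and (3), the sketch below groups a user's portrait into basic, implicit, and social attributes and scores predicted portraits against a manually labeled sample library. The abstract does not define the evaluation indicator, so exact-match accuracy is used here purely as an assumption; the class name `UserPortrait`, the function `attribute_accuracy`, and the `field` attribute are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """One user's portrait, split into the three attribute groups (hypothetical schema)."""
    user_id: str
    basic: dict = field(default_factory=dict)     # taken directly from the profile
    implicit: dict = field(default_factory=dict)  # inferred by mining
    social: dict = field(default_factory=dict)    # derived from user relations


def attribute_accuracy(predicted, labeled, group, key):
    """Fraction of labeled users whose predicted value equals the manual label.

    Assumed indicator: exact-match accuracy over users that carry a label
    for the given attribute. Returns 0.0 when no user is labeled.
    """
    hits = 0
    total = 0
    for user_id, truth in labeled.items():
        expected = getattr(truth, group).get(key)
        if expected is None:
            continue  # no manual label for this attribute
        total += 1
        guess = predicted.get(user_id)
        if guess is not None and getattr(guess, group).get(key) == expected:
            hits += 1
    return hits / total if total else 0.0


# Toy sample library with manual labels, and the model's portraits.
labeled = {
    "u1": UserPortrait("u1", implicit={"field": "sports"}),
    "u2": UserPortrait("u2", implicit={"field": "politics"}),
}
predicted = {
    "u1": UserPortrait("u1", implicit={"field": "sports"}),
    "u2": UserPortrait("u2", implicit={"field": "technology"}),
}
print(attribute_accuracy(predicted, labeled, "implicit", "field"))  # 0.5
```

Keeping the three attribute groups as separate dictionaries lets the same accuracy function be applied to any group and attribute without changing the schema.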
Keywords/Search Tags:social network, typical characters, data collection, user portrait, Twitter