Research Of User Identification Across Social Networks

Posted on: 2021-08-28  Degree: Master  Type: Thesis
Country: China  Candidate: F Yuan  Full Text: PDF
GTID: 2518306476453384  Subject: Computer technology
Abstract/Summary:
User identification across social networks plays an important role in social network research. It aims to use users' multi-dimensional information to determine whether two accounts on different social networks belong to the same user. The task effectively integrates users' information, which helps not only to understand users better but also to understand the regional culture, economy, and politics behind them, thereby benefiting the development of e-commerce and the global economy. The task has attracted considerable attention and seen substantial progress in recent years. However, as privacy policies have become more stringent, much of the private information previously used for user identification across social networks has become difficult to obtain. At the same time, existing user identification classifiers based on traditional machine learning algorithms cannot effectively mine the correlations among features, which limits further performance improvement. To address these problems, this thesis studies the task in depth and implements a user identification system across social networks. The main research covers the following three aspects.

First, a user identification feature extraction framework is proposed that uses only the text of users' tweets, which is easy to obtain. The core idea of the framework is to mine user features from three perspectives: the content of the tweets, the local perspective of writing style, and the global perspective of writing style. The framework maps a user's tweet text to a high-dimensional feature space and then models the similarities between users with a similarity layer. Experiments on real-world data show that the feature extraction framework based on tweet text captures more effective information.

Second, based on the similarity features described above, a new classifier based on self-attention and CNN is proposed to judge whether different accounts belong to the same user. A self-attention mechanism mines the correlations among different features and different similarities to enhance the representational power of the similarity features. A CNN then captures the local correlations between features and similarities. The experimental results show that the classifier based on self-attention and CNN effectively improves performance, and they also confirm the effectiveness of the self-attention mechanism for this task.

Third, a user identification system based on a two-stage user identification framework is proposed. Specifically, considering differences in users' behavior on social networks, users are classified as ordinary users, disguised users, and hidden users. To identify these categories of users efficiently, username information, which is easy to access, is used to build the first-stage classification model, and the method based on the text of users' tweets serves as the second-stage classification model. The experimental results show that the two-stage framework effectively improves the efficiency of user identification.

Finally, based on the algorithms and framework described above, a system for user identification across social networks is designed and implemented.
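The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not specify the models, so the username stage here uses a plain string-similarity ratio, and the tweet-text stage uses character trigram cosine similarity as a crude stand-in for the learned writing-style features and the self-attention/CNN classifier. The function names, the account dictionary layout, and the thresholds `name_hi` and `text_min` are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def username_similarity(a: str, b: str) -> float:
    """Stage 1 feature: cheap, case-insensitive string similarity of usernames."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Character n-gram counts; a simple stand-in for writing-style features."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def text_similarity(tweets_a, tweets_b) -> float:
    """Stage 2 feature: similarity of the concatenated tweet text of two accounts."""
    return cosine(char_ngrams(" ".join(tweets_a)), char_ngrams(" ".join(tweets_b)))


def same_user(acc_a, acc_b, name_hi=0.9, text_min=0.5):
    """Return (decision, stage_used).

    Assumed policy: accept at stage 1 when usernames nearly match;
    otherwise fall back to the more expensive tweet-text comparison.
    Both thresholds are illustrative, not taken from the thesis.
    """
    if username_similarity(acc_a["username"], acc_b["username"]) >= name_hi:
        return True, 1
    return text_similarity(acc_a["tweets"], acc_b["tweets"]) >= text_min, 2


if __name__ == "__main__":
    a = {"username": "f_yuan", "tweets": ["reading papers on social networks"]}
    b = {"username": "F_Yuan", "tweets": ["coffee first, then experiments"]}
    print(same_user(a, b))
```

The design point the sketch tries to convey is the efficiency argument: pairs that the inexpensive username check resolves never reach the text model, so the costlier second stage runs only on the remaining, harder pairs.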
Keywords/Search Tags: User Identification, Social Network, Self-attention, Two-stage Framework, Writing Style Identification