Research On Account Matching Method Across Social Media

Posted on: 2020-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: L B Yang
GTID: 2428330596976082
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, with the rapid development of mobile communication technology, social media has experienced explosive growth. People use a variety of social media sites for different purposes and leave their own unique "digital footprint" on the web when using these services. Mining user data in social media can yield a great deal of valuable information that is useful for social network control and management. However, social media data is usually discrete and fragmented, so mining a single type of social media data performs poorly. If multiple social media data sets belonging to the same user entity can be integrated through cross-social media analysis, the data becomes more helpful for controlling and managing social networks.

Account matching across social media, which identifies the accounts of the same user on different social media, is the basis for cross-social media data analysis. However, traditional methods that use a classification model to determine whether several accounts belong to the same person ignore the PU (Positive and Unlabeled) problem of the datasets, which results in poor generalization ability of the classifier. In addition, methods for extracting account features suffer from feature sparseness and have difficulty making effective use of social relationship features, so traditional methods cannot be widely used in real social network data analysis.

This thesis studies a practical cross-social media account matching method that addresses the above problems. It aims to quickly discover social media accounts belonging to the same person among a large number of accounts, using the small amount of information that is open to the public on social media. To solve the data sparsity problem of account attributes, this thesis extracts features from two kinds of account attributes, usernames and friendship relations, and achieves results in the following two aspects:

(1) This thesis proposes a fast account matching algorithm based on username features. The method extracts relative features from the multiple usernames of two candidate accounts and uses a classifier to determine whether the accounts match. Additionally, an active learning method is proposed to improve the datasets and solve their PU problem, which improves the accuracy and generalization ability of the classifier. Finally, social relationships are used to select the account pairs that are most likely to match, which reduces the amount of computation in the matching process and makes fast account matching possible. This method is highly accurate and suitable for most social media platforms.

(2) This thesis proposes using network representation learning technology to extract social relationship features. A random-walk-based network representation learning technique learns latent features of the relationships between accounts and represents accounts as vectors; a kernel trick is used to solve the problem of mapping node vectors between different network spaces; and vector mapping and similarity calculation are unified under the classifier model. The method has a high recall rate, scales well, and can be applied to large-scale datasets through a parallelized implementation.
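As an illustration of the first contribution, the following is a minimal Python sketch of how relative features might be computed from the usernames of two candidate accounts before they are passed to a classifier. The abstract does not list the actual features used in the thesis, so the four features here (sequence similarity, longest common substring ratio, character bigram Jaccard similarity, and length ratio), the choice of taking the maximum over all username pairs, and the function names `bigrams`, `pair_features`, and `account_features` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import product


def bigrams(name):
    """Return the set of character bigrams of a lowercased username."""
    s = name.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def pair_features(a, b):
    """Similarity features for one pair of usernames, each in [0, 1].

    These features are illustrative assumptions, not the thesis' feature set.
    """
    a_l, b_l = a.lower(), b.lower()
    longer = max(len(a_l), len(b_l), 1)  # avoid division by zero
    matcher = SequenceMatcher(None, a_l, b_l, autojunk=False)
    longest = matcher.find_longest_match(0, len(a_l), 0, len(b_l)).size
    ga, gb = bigrams(a), bigrams(b)
    union = ga | gb
    return {
        "seq_ratio": matcher.ratio(),
        "lcs_ratio": longest / longer,
        "bigram_jaccard": len(ga & gb) / len(union) if union else 0.0,
        "len_ratio": min(len(a_l), len(b_l)) / longer,
    }


def account_features(names_a, names_b):
    """Take the maximum of each feature over all username pairs of two accounts."""
    best = {}
    for a, b in product(names_a, names_b):
        for key, value in pair_features(a, b).items():
            best[key] = max(best.get(key, 0.0), value)
    return best


if __name__ == "__main__":
    # Hypothetical usernames for two candidate accounts.
    print(account_features(["LBYang", "yang_lb"], ["lbyang88"]))
```

A feature dictionary like this could be turned into a vector and given to any binary classifier. The active learning step and the use of social relationships to prune candidate pairs are not shown.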
Keywords/Search Tags: across social media, account matching, network representation learning, nonlinear kernel support vector machine, PU classification