| The development of social networks has enriched social relationships between people. As a mainstream social platform, Twitter hosts a large number of active accounts and the high-quality tweets they publish. Classifying Twitter accounts helps people find accounts of interest among the massive number of Twitter accounts and then treat accounts differently according to their category. Existing classification methods for Twitter accounts mainly use the basic attribute features of the account or the text features of its tweets. These methods ignore the social network structure features of the account, even though those features reasonably reflect the account's basic social relationships and the mutual influence between accounts, and can therefore improve account classification. This paper uses basic knowledge from the field of complex networks to model the Twitter social network and then studies related classification methods. The main work and innovations of this paper are summarized as follows: (1) A heterogeneous network construction method is proposed based on the different social behaviors in the Twitter social network. The method introduces five types of nodes (account, tweet, event, person, and hashtag) and mines the connections between these nodes from the original tweet data, forming a heterogeneous network that reflects Twitter social relationships. This heterogeneous network contains the social network structure features of the accounts and is the basis for the subsequent account classification tasks. (2) Based on the constructed heterogeneous network, a transductive learning method for account classification is proposed. Compared with inductive learning, transductive learning uses not only the known training samples but also the clustering relationships among unlabeled samples, which makes it suitable for scenarios with insufficient training samples. Based on the different social behaviors of accounts in the heterogeneous network, this paper extracts different types of meta-paths to construct an influence matrix between accounts, and then propagates the label information of known samples to other nodes in the heterogeneous network. (3) A heterogeneous network representation learning method is proposed. Heterogeneous networks contain rich structural features of nodes; extracting these features as node feature vectors can help improve account classification performance. This paper proposes a random walk algorithm for heterogeneous networks that mines the context of each node, and then trains a representation learning model to extract the nodes' network structure features. The heterogeneous network representation learning method performs well on the account classification task. |
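
The meta-path idea in contribution (2) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single Account-Hashtag-Account meta-path, a row-normalized path-count matrix as the influence matrix, and a standard iterative label-propagation update with a damping factor `alpha`. All function names, the toy accounts, and the parameter values are hypothetical.

```python
from collections import defaultdict


def metapath_matrix(account_tags):
    """Count Account-Hashtag-Account paths between every pair of accounts."""
    accounts = sorted(account_tags)
    index = {acc: i for i, acc in enumerate(accounts)}
    by_tag = defaultdict(set)
    for acc, tags in account_tags.items():
        for tag in tags:
            by_tag[tag].add(acc)
    n = len(accounts)
    w = [[0.0] * n for _ in range(n)]
    for members in by_tag.values():
        for a in members:
            for b in members:
                if a != b:  # ignore trivial paths back to the same account
                    w[index[a]][index[b]] += 1.0
    return accounts, w


def row_normalize(w):
    """Scale each row to sum to 1; rows of isolated accounts stay all zero."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total for x in row] if total > 0 else row[:])
    return out


def propagate(w, seeds, classes, alpha=0.8, iters=50):
    """Iterate F <- alpha * S F + (1 - alpha) * Y (assumed update rule)."""
    n, k = len(w), len(classes)
    s = row_normalize(w)
    # Y holds one-hot rows for labeled accounts and zero rows for the rest.
    y = [[1.0 if seeds.get(i) == c else 0.0 for c in classes] for i in range(n)]
    f = [row[:] for row in y]
    for _ in range(iters):
        f = [
            [alpha * sum(s[i][j] * f[j][c] for j in range(n)) + (1 - alpha) * y[i][c]
             for c in range(k)]
            for i in range(n)
        ]
    return f


def classify(account_tags, labeled, classes, alpha=0.8, iters=50):
    """Assign every account the class with the highest propagated score."""
    accounts, w = metapath_matrix(account_tags)
    seeds = {accounts.index(acc): label for acc, label in labeled.items()}
    f = propagate(w, seeds, classes, alpha, iters)
    return {
        acc: classes[max(range(len(classes)), key=lambda c: f[i][c])]
        for i, acc in enumerate(accounts)
    }


# Toy data: two labeled accounts, two unlabeled ones (all hypothetical).
account_tags = {
    "alice": {"#nba", "#playoffs"},
    "bob": {"#nba"},
    "carol": {"#election", "#senate"},
    "dave": {"#senate"},
}
labeled = {"alice": "sports", "carol": "politics"}
predictions = classify(account_tags, labeled, ["sports", "politics"])
print(predictions)
```

In this toy network the unlabeled accounts `bob` and `dave` each share a hashtag with exactly one labeled account, so they inherit that account's label. Further meta-paths (for example through shared events or mentioned persons) could be added by summing or weighting their path-count matrices before normalization; how the paper combines them is not stated in the abstract.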