| With the development of information technology, online social networks such as Twitter, LinkedIn and Weibo have become increasingly popular. As one of the world's largest online social networks, Twitter has become an important platform for international users to express opinions and share information. Twitter has a large number of daily active accounts, which post or retweet large volumes of text. The text an account posts is usually closely related to its owner's hobbies and personal life; an account also tends to follow accounts that share the same hobbies, and interactions such as retweeting and mentioning are more frequent among them. Based on these observations, social network account classification uses the text content published by accounts and the relationship information between accounts to classify them, with the aim of supporting personalized recommendation of accounts in a specific category and identification of malicious accounts. This paper takes Twitter accounts as the research target and studies the account classification problem from the perspectives of published text content and account relationships. The main research contents are as follows: 1. For the account text representation problem, building on the distributed word embedding model word2vec and considering that traditional word embeddings are unsupervised, this paper proposes a semi-supervised account text embedding model, Semi-User2vec. The text is mapped into a dense low-dimensional vector, yielding an account text feature vector that carries label information, which is then used as input to a Support Vector Machine (SVM) classifier to classify the account. 2. For the fusion of text and account relationships, considering that accounts of the same type in a social network frequently mention each other in their tweets, this paper extracts the mention relationships from tweets and builds a mention network. Drawing on Word Mover's Distance and taking the semi-supervised text embedding Semi-User2vec as input, this paper proposes a method for computing neighborhood similarity in the social network. Considering the characteristics of the neighborhood similarity, this paper also proposes an ensemble learning method that fuses the text features and the neighborhood similarity for account classification. 3. To exploit multi-dimensional relationships for account classification, this paper processes Twitter data and builds a multi-dimensional network consisting of mention, retweet, and friend networks. Building on emerging neural network methods, this paper proposes a Multidimensional Graph Convolutional Network, which is based on the graph convolutional network mechanism and uses the attention mechanism from conventional neural networks to fuse multiple relational networks for the classification of Twitter accounts. |
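
The mention-network construction described in item 2 can be illustrated with a minimal sketch. It is not the paper's implementation: the input format (an iterable of `(author, text)` pairs), the function name `build_mention_network`, and the handle pattern (an `@` followed by 1 to 15 letters, digits, or underscores) are all assumptions made for illustration. Edge weights count how often one account mentions another.

```python
import re
from collections import Counter

# Assumed handle pattern: "@" plus 1-15 letters, digits, or underscores,
# not embedded in a longer word (so e-mail addresses are skipped).
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")


def build_mention_network(tweets):
    """Return a Counter mapping directed (author, mentioned) edges to counts.

    `tweets` is an iterable of (author, text) pairs; this input format is a
    hypothetical choice for the sketch, not taken from the paper.
    """
    edges = Counter()
    for author, text in tweets:
        source = author.lower()
        for handle in MENTION_RE.findall(text):
            target = handle.lower()
            if target != source:  # ignore self-mentions
                edges[(source, target)] += 1
    return edges


if __name__ == "__main__":
    sample = [
        ("alice", "Great talk by @Bob and @carol!"),
        ("bob", "Thanks @alice"),
        ("alice", "@bob see you tomorrow"),
    ]
    for (source, target), weight in sorted(build_mention_network(sample).items()):
        print(source, "->", target, weight)
```

The weighted edge list produced this way could serve as the adjacency structure on which a neighborhood similarity or a graph convolution is later computed; how the paper weights or filters edges is not stated in the abstract.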