
A Study Of Social Network Information Filtering Based On Probabilistic Graphic Model

Posted on: 2017-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: J B Xing
GTID: 2308330485453765
Subject: Control Science and Engineering
Abstract/Summary:
Social network platforms have developed rapidly in recent years and generate massive amounts of data. Faced with this explosive growth, users are easily overwhelmed. Information filtering is one useful way to address the problem. Social network information can be divided into useless spam and normal information. Spam not only lowers the quality of service but also hinders scientific research on social networks, so it should be filtered out; this operation is called non-personalized information filtering. Personalized information filtering, by contrast, extracts useful information from normal network information and recommends it to users. Information filtering also improves the quality of service and the business value of social networks. Personalized services require accurate models and effective prediction of user interests. This research relies on social network data, which are complex, diverse, and uncertain. Probabilistic Graphical Models (PGM) combine probability theory and graph theory and are well suited to handling uncertainty. This thesis therefore focuses on two typical PGM methods, Latent Dirichlet Allocation (LDA) and Probabilistic Matrix Factorization (PMF), applying them to spam filtering and user interest prediction respectively. The work illustrates the advantages and application prospects of these methods for processing complex social network data. Our contributions can be summarized as follows:
First, we propose a novel approach to spam filtering based on LDA feature expansion that incorporates the text category. Existing methods are mainly based on short-text classification. They rarely take sufficient account of factors such as sparseness, semantic information, and background features, and so fail to achieve good filtering performance. We therefore apply the LDA model to perform feature expansion combined with the text category, which addresses the sparseness problem. The method then identifies background features and reduces their weight to lessen their influence on classification. Finally, we use undersampling to deal with classification on the imbalanced dataset. Experimental results show that the method achieves better filtering performance than current filtering methods.
Second, we propose an approach for predicting user interests based on users' social relations. Most previous work analyzes user interests using only the user's behavior history, users with similar interests, and so on; almost no studies focus on predicting user interests. Our observations suggest that the factors affecting user interests mainly include the user's behavior history, social relations, correlations among interests, the influence of time, and the trend of social hubs. Along this line, we use PMF to design a novel method for predicting user interests. Experimental results on a real microblog dataset demonstrate the correctness and effectiveness of our approach.
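For readers unfamiliar with PMF, the following is a minimal sketch of plain PMF fitted by stochastic gradient descent on a toy user-interest matrix. It is an illustration only, not the thesis's model: it omits the social-relation, time, and interest-correlation factors described above, and the function names, toy scores, and hyperparameters are all assumptions.

```python
import random

def train_pmf(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit latent user/item vectors by SGD on squared error plus L2 penalty.

    This corresponds to a MAP estimate for PMF with Gaussian priors.
    `ratings` is a list of (user, item, score) triples (hypothetical toy data).
    """
    rng = random.Random(seed)
    # Small random initial latent vectors.
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on the regularized squared error.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V

def predict(U, V, u, i):
    """Predicted interest score: inner product of the latent vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))

def rmse(ratings, U, V):
    """Root-mean-square error over the observed triples."""
    total = sum((r - predict(U, V, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

# Toy observed (user, interest, score) triples; values are made up.
observed = [(0, 0, 1.0), (0, 1, 0.8), (1, 0, 0.9),
            (1, 2, 0.1), (2, 1, 0.2), (2, 2, 1.0)]

U0, V0 = train_pmf(observed, 3, 3, epochs=0)   # untrained baseline
U, V = train_pmf(observed, 3, 3)               # trained model
before, after = rmse(observed, U0, V0), rmse(observed, U, V)
```

Unobserved pairs, such as user 0 and interest 2, can then be scored with `predict(U, V, 0, 2)`. Extensions of the kind the abstract describes would typically add further terms to the objective.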
Keywords/Search Tags:Probabilistic Graphical Models (PGM), information filtering, Latent Dirichlet Allocation (LDA), user interests prediction, Probabilistic Matrix Factorization (PMF)
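The undersampling step mentioned for the spam-filtering contribution can be illustrated with a minimal sketch of random undersampling, which trims every class to the size of the smallest one. The abstract does not say which undersampling variant the thesis uses, so this is an assumed, generic version; the function name and toy messages are hypothetical.

```python
import random
from collections import defaultdict

def undersample(samples, labels, seed=0):
    """Randomly drop examples so that every class has as many items
    as the smallest class. Returns a shuffled list of (sample, label)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    # Size of the minority class sets the per-class quota.
    quota = min(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        for sample in rng.sample(items, quota):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced

# Toy imbalanced dataset: 8 normal messages, 2 spam messages (made up).
texts = ["msg%d" % n for n in range(10)]
tags = ["normal"] * 8 + ["spam"] * 2
balanced = undersample(texts, tags)
```

Random undersampling discards majority-class data, so it trades some information for a balanced training set.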