
Research On Key Problems Of Social Network Based On Sparse Learning And Social Theories

Posted on: 2016-03-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 1228330467495432
Subject: Computer application technology
Abstract/Summary:
The spread of social networks has given data mining researchers a new platform and raised a series of novel research problems, such as short text processing, link prediction, network evolution, and recommender systems. A virtual social network is a microcosm of real-world relationships among people: the phenomena observed in it follow the laws of social development and can be reasonably explained by sociological theories. The studies in this paper therefore build on the characteristics of social networks, take sociological theories as their theoretical basis, apply sparse learning models to sparse, noisy, and missing data, and focus on several key problems of social networks. The main contributions and innovations are as follows:
(1) Research on the Problem of Topic Identification Based on a Sparse Learning Model with Lasso
Short texts have become a popular means of expression through which users can easily and freely produce content on various topics. Because of their characteristics (brief content, missing semantics, noisy data, etc.), short texts pose enormous challenges to natural language processing and data mining. In addition, a user's latent characteristics, such as interests and emotions, are reflected in his or her behavior data in a social network. The evolution of user interests or emotions can be interpreted through consistency theory and contagion theory from sociology. Two processes are proposed to explain social phenomena on popular topic categories. First, people are more likely to have consistent preferences within a specific short time interval, which is known as preference consistency. Second, people tend to influence others, or their friends, through a series of interactions and feedback on specific topic categories, which is known as social contagion.
Therefore, we first collect data from the Twitter and Citation Network datasets, construct a user-user relationship matrix, and use t-test hypothesis testing to verify that consistency theory and contagion theory hold for topic identification of short texts. We then extract the text content from the datasets, use a traditional text feature selection method to construct a message-feature matrix, and derive a message-message correlation matrix from the user-message matrix. The message-message correlation matrix is regarded as social context and is integrated into the analysis of text features. Finally, we propose a novel supervised learning framework, STI, which tackles short and noisy texts by integrating regularization terms for consistency theory and contagion theory into a sparse learning model with Lasso, and identifies topic categories by optimizing the objective function with gradient descent. Experiments are designed to evaluate the STI framework on two datasets. The experimental results demonstrate that the proposed framework can effectively identify topics in short texts. The experiments also describe the impact of the different social theories on the accuracy of topic identification.
(2) Research on the Problem of Trust Prediction Based on Social Status Theory
Trust, a major component of communication among people, plays an important role in exchanging experience, sharing information, and consumer decision-making. However, trust is an abstract and complex concept in social psychology and is affected by many factors, so it is difficult to identify its inducing factors and formation mechanism. In sociology, homophily theory and social status theory are important theoretical foundations for the construction of trust relations. Users with similar characteristics, such as similar backgrounds, similar interests, and common communities and friends, are more likely to form trust relations.
Within a community there are obvious status differences between users, and users with low status are more likely to trust users with high status. Since trust prediction based on the homophily effect has already been studied, this paper focuses on trust prediction based on social status theory. We first use the Epinions dataset to verify the strong correlation between trust relations and status difference, namely that users with lower social status are more likely to trust users with higher status. We then propose a novel trust prediction framework, sTrust, which models social status theory as a regularization term and incorporates it into non-negative matrix factorization for trust prediction. The framework addresses the data sparsity problem through a low-rank model. Finally, we evaluate sTrust on two real-world datasets, Epinions and Ciao, to understand the importance of status theory in trust prediction. Experimental results demonstrate that the proposed framework sTrust can significantly improve trust prediction performance.
(3) Research on the Problem of Link Prediction on Signed Networks
With the help of social network platforms, users can now establish trust and distrust relations, form friend and foe relations, and express supporting and opposing viewpoints on forums. In the real world, positive and negative relations are widespread in many fields, such as international relations, commercial trade, and interpersonal communication. A signed network is a network whose links carry a positive or negative sign. Research on signed networks originated in the field of social psychology in the 1940s. The spread of social networks brings more opportunities and novel problems for the in-depth study of signed networks, among which link prediction is one of the key problems.
In addition, sociological theories such as interactional emotion theory, structural balance theory, and social status theory can explain the sign of a link and provide theoretical principles for improving prediction accuracy. We therefore first collect topological data on link relations and behavior data on user interactions from Epinions, explore reasonable quantization strategies to measure interactional emotion and social status, use a density estimation method to verify the effectiveness of emotion theory and status theory for link prediction, and further assess the structural balance of the signed network in the Epinions dataset. Then, based on an analysis of the impact of the three social theories on link prediction, we use interactional emotion theory to enhance the reliability of the decomposed matrix and to compensate for the limitations of balance theory and status theory. Finally, we propose three models for link prediction on signed networks, namely MF-I, MF-BI, and MF-SI, which model interactional emotion as a reliability-enhancing factor of the matrix, model balance theory and status theory as regularization terms, and incorporate them into a sparse learning model based on matrix decomposition. Experiments are designed to evaluate the proposed models on Epinions. The experimental results demonstrate that the three models can effectively perform link prediction on signed networks. The experiments also describe the impact of the different social theories on the accuracy of link prediction.
Since social network analysis is a cross-field, multidisciplinary research area, many issues remain worthy of study. In future work, we will focus on the data sparsity problem and on combining theoretical models with practical applications, such as trust-aware recommendation and spam (spammer) detection.
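As an illustration of the structural-balance assessment mentioned under contribution (3), the following is a minimal sketch and not the dissertation's own code. It assumes the standard definition from structural balance theory, under which a triangle is balanced when the product of its three edge signs is positive, and it treats links as undirected, which simplifies directed trust data. The function name `triangle_balance`, the edge representation, and the toy network are hypothetical.

```python
from itertools import combinations


def triangle_balance(signs):
    """Count balanced and unbalanced triangles in an undirected signed graph.

    `signs` maps frozenset({u, v}) to +1 or -1 (an assumed representation).
    A triangle is balanced when the product of its three signs is positive.
    Returns a tuple (balanced, unbalanced).
    """
    # Collect every node that appears in at least one signed edge.
    nodes = sorted({node for edge in signs for node in edge})
    balanced = 0
    unbalanced = 0
    # Brute-force scan of all node triples: fine for toy data only.
    for a, b, c in combinations(nodes, 3):
        edges = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
        if all(edge in signs for edge in edges):
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            if product > 0:
                balanced += 1
            else:
                unbalanced += 1
    return balanced, unbalanced


# Hypothetical toy network with four users and six signed links.
toy_signs = {
    frozenset(("a", "b")): +1,
    frozenset(("b", "c")): +1,
    frozenset(("a", "c")): +1,
    frozenset(("c", "d")): -1,
    frozenset(("b", "d")): +1,
    frozenset(("a", "d")): -1,
}

balanced_count, unbalanced_count = triangle_balance(toy_signs)
print(balanced_count, unbalanced_count)
```

The fraction of balanced triangles is one simple way to summarize how well a signed network agrees with balance theory. The scan over all node triples is cubic in the number of nodes, so a dataset of realistic size would need an approach that enumerates triangles from adjacency lists instead.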
Keywords/Search Tags: Sparse Learning, Matrix Factorization, Lasso Model, Social Theories, Signed Network, Topic Identification, Trust Prediction, Link Prediction