User Credit Profiling Techniques For Online Users With Big Social Data

Posted on: 2018-03-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G M Guo
GTID: 1318330512985619
Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid growth of mobile internet and social media technologies, people have moved from BBS and blogs to Twitter- and Facebook-style websites such as Weibo for networking, learning, and entertainment. Meanwhile, as internet and social media applications seep into everyday life, large volumes of user-generated content (UGC) have accumulated on the social Web. Compared with traditional Web data from emails, blogs, and BBS, social media data is more diverse and timely. Weibo in particular has become the major public platform for publishing events, interacting with others, and spreading information. Because the platform is open, simple, real-time, and large-scale, Weibo-style social data has also become a standard research testbed, attracting considerable attention from academics as well as researchers in industrial labs. Many studies have used social data to examine social network theories, user behavioral patterns, event detection and analysis, and rumor identification and prediction. In short, social media big data contains valuable information and knowledge worth mining. However, characteristics of social data such as short length, low quality, rapid change, and weak correlations also pose unique challenges for traditional data mining methods, undermining various previously successful techniques.

To address these issues and achieve social big data based user profiling, this dissertation uses the Weibo dataset as the testbed for technique development. It focuses on three subtasks: high-utility sequential data mining algorithms, targeted discovery of latent user behavior dimensions, and personal credit scoring based on feature engineering and feature learning. In addition, building on previous studies, it surveys recent advances in user profiling with social data and provides a categorization and possible future directions for social data based user profiling.

Specifically:

1) Weibo data is presented as a timeline, which is in effect a kind of event-based sub-sequence data, i.e., episodes. Episode frequency alone is not enough to find high-quality Weibo data. This dissertation proposes several strategies to optimize traditional high-utility episode mining algorithms so as to significantly reduce running time and memory consumption.

2) Each tweet in Weibo data contains not only length-limited text but also contextual behavioral information. Simply combining these two kinds of data with feature engineering techniques is not enough for subtle user attribute inference. This dissertation proposes to learn latent user behavior dimensions (LUBD) with text and behavior data as simultaneous input. Experimental results show that LUBD-CM outperforms LDA and a simple Naive Bayes model by a large margin for personal credit scoring.

3) The self-reported demographic information and social network structure available in social data are also important data sources for user profiling. However, simply designed features are not strong enough to feed standard classifiers such as SVM and Decision Trees. To leverage multi-source user-generated content for user credit prediction, this dissertation proposes to design social features from both domain knowledge and empirical observations, and to perform two-level feature learning to harness the hidden evidence among features. In this way, the dissertation implements a social data based personal credit scoring framework with both stacking and boosting techniques, and provides surprising insights into how people behave differently from each other.

It is worth noting that although these algorithms are evaluated with user credit as the target attribute, other attributes such as age, gender, and location can also be used as target attributes.
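As a rough illustration of why episode frequency alone can be misleading in the first subtask, the following minimal sketch counts occurrences of a serial episode in a timeline and sums the utilities of the matched events. It is not the dissertation's algorithm: the event names, the utility values, the `max_span` window, and the greedy earliest-match rule are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# (timestamp, event type, utility) -- utilities here are made-up weights.
Event = Tuple[int, str, float]


def episode_occurrences(seq: List[Event], episode: Sequence[str],
                        max_span: int) -> List[List[int]]:
    """Return index lists of occurrences of a serial episode.

    For every event that matches episode[0], greedily match the remaining
    event types in order, as long as they fall within max_span time units
    of the starting event.
    """
    occurrences = []
    for i, (t0, e0, _) in enumerate(seq):
        if e0 != episode[0]:
            continue
        matched = [i]
        k = 1  # position of the next event type to match
        for j in range(i + 1, len(seq)):
            if k == len(episode):
                break
            t, e, _ = seq[j]
            if t - t0 > max_span:
                break  # outside the time window
            if e == episode[k]:
                matched.append(j)
                k += 1
        if k == len(episode):
            occurrences.append(matched)
    return occurrences


def episode_stats(seq: List[Event], episode: Sequence[str],
                  max_span: int) -> Tuple[int, float]:
    """Return (frequency, total utility) of an episode in the sequence."""
    occs = episode_occurrences(seq, episode, max_span)
    utility = sum(seq[i][2] for occ in occs for i in occ)
    return len(occs), utility


# Hypothetical timeline of one user's actions.
timeline = [
    (1, "post", 1.0),
    (2, "repost", 3.0),
    (3, "post", 1.0),
    (4, "comment", 2.0),
    (5, "repost", 5.0),
    (9, "repost", 4.0),
]

print(episode_stats(timeline, ("post", "repost"), max_span=3))
print(episode_stats(timeline, ("post", "comment"), max_span=3))
```

In this toy timeline both episodes occur equally often, yet their total utilities differ, which is the kind of distinction a frequency-only miner would miss.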
Keywords/Search Tags:Social Big Data, Social Network, Weibo Dataset, User Profiling, Credit Profiling, Event Sequence Mining, Topic Models, Feature Learning