Information Diffusion Pattern Analysis And Popularity Prediction For Online Media

Posted on: 2019-06-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Chang
Full Text: PDF
GTID: 1318330542497991
Subject: Computer application technology
Abstract/Summary:
Due to the prevalence of Internet applications, online media has become a major way to receive and share information. Many Internet applications can be viewed as networks, such as social networks and competition networks among companies. The flow of online media data among network nodes results in information diffusion. On the one hand, these data may carry malicious information, such as rumors and deceptive advertisements. On the other hand, most applications are open and allow users to share data and information with others, which creates a positive-feedback effect in information diffusion. Therefore, exploring the laws of information diffusion in networks will not only help to capture its mechanism, analyze group behaviors and prevent the malicious dissemination of information, but also benefit theoretical research in related fields, including sociology and management. To this end, for the analysis and application of information diffusion in online media, this dissertation aims to describe information diffusion between nodes, detect the information source, and make predictions based on information diffusion in networks. It focuses on three aspects: recruitment demand analysis with latent confounding factors, information source detection in networks, and online serial popularity prediction with information diffusion. Specifically:

First, it studies the problem of online recruitment demand analysis, from the perspective of describing information diffusion between nodes in the presence of latent confounding factors. Because companies compete for talent and business, they form competition networks in which the companies are the nodes. Careful observation of real-world recruitment data shows that each company's recruitment demand is influenced by three latent confounding factors, namely individual business requirements, homogeneity with competitors, and industry trends. However, traditional methods of recruitment demand analysis neglect the competition networks and do not take these three factors into account. Thus we propose a solution to recruitment demand analysis with latent confounding factors, called TMRDA. It is a novel unsupervised generative topic model that integrates the three confounding factors into the generation of job postings at the word level. We also introduce an effective method to capture prior knowledge of homogeneity with competitors and of industry trends from recruitment data. TMRDA can enable many intelligent applications and in-depth recruitment analyses, such as recruitment demand forecasting and market competition analysis. Extensive experiments on real-world recruitment data validate the effectiveness and interpretability of our model for recruitment demand analysis.

Second, to find the information source, this dissertation explores the problem of information source detection. When information is spreading among people, detecting its source from the observed diffusion results is important for epidemic outbreak prevention, Internet virus source identification, and rumor source tracing in social networks. Although related work has recognized the importance of this problem, existing methods remain inadequate because of their high computational complexity and limited effectiveness. Therefore, we derive a Maximum A Posteriori (MAP) estimator to detect a single information source in undirected and weighted graphs. Unlike existing methods, it uses other simple but effective methods as the prior, and exploits both infected nodes and their uninfected neighbors to obtain an effective propagation probability. It then derives the exact form of the likelihood of the observed infected subgraph. For better efficiency, we also design two approximate MAP estimators, namely Brute Force Search Approximation (BFSA) and Greedy Search Bound Approximation (GSBA). BFSA enumerates the corresponding permitted permutations to derive the approximate likelihood, while GSBA uses a greedy search strategy to reduce the computational complexity. Experimental results on several data sets validate the effectiveness of our methods in detecting a single information source under different settings in weighted graphs.

Finally, it exploits information diffusion to predict the popularity of online serials. When a user finds an interesting online serial, such as a teleplay or a web fiction, they may share it with their friends in a social network. If the friends are influenced and like it, they will also try the serial. This process repeats, forming a positive-feedback effect of information diffusion, which in turn increases the popularity of the serial. However, most previous methods treat the popularity prediction of online content as a time series prediction problem and neglect this process. Therefore, taking teleplays as an example, we first introduce a straightforward yet effective Naive Autoregressive (NAR) model based on the correlation between the popularities of adjacent episodes. However, NAR neglects user watching behaviors and information diffusion involving other applications, so we further propose a Transfer Autoregressive (TAR) model. It assumes that the audience of an episode consists of two parts, i.e., followers and freshers. In addition, as a derivative of the TAR model, we design a novel metric, namely favor, for evaluating the quality of online serials. Experimental results on two real-world data sets show that our models are effective and outperform baselines in popularity prediction for online serials.
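The idea behind the NAR model, predicting an episode's popularity from that of the preceding episode, can be illustrated with a first-order autoregression. The sketch below is only an illustration under assumptions: the abstract does not give the NAR equations, so the linear form `y[t] = a * y[t-1] + b`, the least-squares fit, the function names `fit_ar1` and `predict_next`, and the view counts are all hypothetical and not taken from the dissertation.

```python
def fit_ar1(views):
    """Fit y[t] = a * y[t-1] + b by ordinary least squares.

    `views` holds one popularity value per episode, in airing order.
    The AR(1) form is an assumption made for illustration only.
    """
    x = views[:-1]  # popularity of episode t-1
    y = views[1:]   # popularity of episode t
    n = len(x)
    if n < 2:
        raise ValueError("need at least three episodes to fit two parameters")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        raise ValueError("popularity values are constant; slope is undefined")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict_next(views, a, b):
    """Predict the next episode's popularity from the latest observed one."""
    return a * views[-1] + b


# Synthetic (made-up) view counts that follow y[t] = 0.8 * y[t-1] + 10 exactly.
views = [100.0, 90.0, 82.0, 75.6]
a, b = fit_ar1(views)
forecast = predict_next(views, a, b)
```

A model of this kind uses only the serial's own history, which is the limitation the abstract attributes to NAR: it has no term for user watching behavior or for audiences arriving through diffusion from other applications.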
Keywords/Search Tags:online media, information diffusion, recruitment demand analysis, information source detection, popularity prediction