
The Application Of Continuous Hidden Markov Model In Click Fraud Identification

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: W J He
GTID: 2248330392461301
Subject: Management Science and Engineering
Abstract/Summary:
With the rapid growth of search engine keyword advertising, click fraud has become a serious problem for advertisers and search engine companies, and how to identify and prevent it has become a focus of research. In this paper, we analyze the click fraud problem in online keyword advertising and its behavioral characteristics. Because the click behavior of keyword advertising is consistent with the basic assumptions of the hidden Markov model (HMM), we apply the theoretical framework of HMM to click fraud identification.

The main work of this paper is as follows:

First, because HMM is only a theoretical framework, we analyzed the behavior patterns of keyword clicks, built a continuous HMM (CHMM) for the online keyword advertising of search engines, and established recognition criteria for click fraud.

Second, we trained a specific CHMM on actual data and evaluated its effectiveness. The statistical results show that identifying click fraud with the CHMM achieves a high accuracy rate.

Third, we analyzed how recognition accuracy changes when the model parameters take different values, and determined the optimal number of hidden states (a fixed value) and the other variable parameters.

Fourth, factors such as time periods and emergencies can visibly raise the click rate of online keyword advertising even though no click fraud is involved. We therefore established a dynamic CHMM that updates the historical training data and generates new parameters. This model reduces the impact of such factors on identification accuracy.

Finally, parameter estimation is very important for the model to achieve high accuracy. The Baum-Welch algorithm has a number of defects. Compared with Baum-Welch, the Segmental K-Means (SKM) algorithm not only reduces computational complexity but also converges faster. Moreover, SKM focuses more on the automatic classification and recognition of the model's output patterns, so it is better targeted at the click fraud problem. The data analysis also shows that SKM achieves a higher accuracy rate for click fraud identification. In addition, this paper makes a preliminary exploration of an HMM parameter estimation method based on Gibbs sampling, a Markov chain Monte Carlo (MCMC) technique.
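
As a rough illustration of the Segmental K-Means idea mentioned in the abstract, here is a minimal Python sketch, not the implementation used in the thesis. It assumes a CHMM with one 1-D Gaussian emission per hidden state, and it assumes the observations are inter-click intervals in seconds; the abstract does not say which features were used. The function names, the smoothing constant, and the threshold-based decision rule are all hypothetical. Training alternates between a Viterbi segmentation step and a re-estimation step over the segmented data, which is the general shape of segmental K-means.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def viterbi(obs, log_pi, log_A, means, variances):
    """Return (best state path, joint log-likelihood of that path)."""
    n = len(log_pi)
    delta = [log_pi[s] + log_gauss(obs[0], means[s], variances[s]) for s in range(n)]
    back = []
    for x in obs[1:]:
        prev = delta
        # Best predecessor for each state s.
        ptr = [max(range(n), key=lambda r: prev[r] + log_A[r][s]) for s in range(n)]
        delta = [prev[ptr[s]] + log_A[ptr[s]][s] + log_gauss(x, means[s], variances[s])
                 for s in range(n)]
        back.append(ptr)
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]

def segmental_kmeans(sequences, n_states, n_iter=10, min_var=1e-3, smooth=1.0):
    """Fit a Gaussian HMM by alternating Viterbi segmentation and re-estimation."""
    # Initialisation (an assumption): spread the means over quantiles of the data.
    values = sorted(x for seq in sequences for x in seq)
    chunk = len(values) / n_states
    means = [values[int((s + 0.5) * chunk)] for s in range(n_states)]
    centre = sum(values) / len(values)
    var0 = max(sum((v - centre) ** 2 for v in values) / len(values), min_var)
    variances = [var0] * n_states
    log_pi = [math.log(1.0 / n_states)] * n_states
    log_A = [[math.log(1.0 / n_states)] * n_states for _ in range(n_states)]
    for _ in range(n_iter):
        # Segmentation step: assign every observation to a state via Viterbi.
        paths = [viterbi(seq, log_pi, log_A, means, variances)[0] for seq in sequences]
        # Re-estimation step: smoothed counts and per-state sample statistics.
        start = [smooth] * n_states
        trans = [[smooth] * n_states for _ in range(n_states)]
        members = [[] for _ in range(n_states)]
        for seq, path in zip(sequences, paths):
            start[path[0]] += 1
            for a, b in zip(path, path[1:]):
                trans[a][b] += 1
            for x, s in zip(seq, path):
                members[s].append(x)
        log_pi = [math.log(c / sum(start)) for c in start]
        log_A = [[math.log(c / sum(row)) for c in row] for row in trans]
        for s, xs in enumerate(members):
            if xs:  # keep the old parameters for a state that received no data
                means[s] = sum(xs) / len(xs)
                variances[s] = max(sum((x - means[s]) ** 2 for x in xs) / len(xs), min_var)
    return log_pi, log_A, means, variances

def is_suspicious(seq, model, threshold):
    """Hypothetical rule: flag a sequence whose per-click Viterbi score is too low."""
    _, score = viterbi(seq, *model)
    return score / len(seq) < threshold

# Toy data (made up for illustration): intervals alternating around 30 s and 120 s.
normal = [[30, 32, 28, 120, 118, 125, 31, 29],
          [119, 122, 30, 33, 27, 121, 117, 30]]
model = segmental_kmeans(normal, n_states=2)
```

A sequence of very short intervals, such as `[1.0] * 8`, scores far below the training sequences under this toy model, so a call like `is_suspicious([1.0] * 8, model, threshold=-10.0)` would flag it. In practice the threshold would have to be chosen from labeled data, and the dynamic variant described in the abstract would amount to re-running the training step as the historical window is updated.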
Keywords/Search Tags: Click fraud, CHMM, Baum-Welch algorithm, SKM algorithm, MCMC