
Application Of Web Mining Technology To Click Fraud Detection

Posted on: 2012-08-13 / Degree: Master / Type: Thesis
Country: China / Candidate: A C Li / Full Text: PDF
GTID: 2178330335474341 / Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, online advertising has become a new marketing tool. Marketers in all walks of life promote their products and brands through colorful online advertisements, and they must pay for these ads. Cost-Per-Click (CPC) is a simple and popular billing model: a charge is recorded whenever an ad on a web page is clicked and the user is taken to the relevant website or to the details of the advertisement. Click fraud exists in the Cost-Per-Click model of the online advertising industry. It occurs when a person, manually or by means of a computer program, imitates a legitimate web browser user clicking on an ad's link without being interested in the ad itself, merely to obtain some benefit. The emergence and proliferation of click fraud have greatly hindered the healthy development of the Internet advertising industry.
The purpose of this paper is to study the application of web mining technology to click fraud in online advertising. The paper designs a click fraud detection model for online advertising based on web mining algorithms. The model draws on the methods of domestic and foreign research and combines web mining techniques such as outlier mining, multiple linear regression analysis, and timing analysis. The detection system built on the model is then introduced systematically. Detection is divided into two steps: preliminary assessment and assessment modification. The preliminary assessment analyzes the data mainly on the basis of the current click and the click stream over a short period of time, then assigns the click a preliminary assessment score and feeds it back to the foreground. The main work of assessment modification is to use web mining algorithms to correct the preliminary assessment and to make predictions.
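The preliminary assessment step can be illustrated with a minimal sketch. The abstract does not state which features or scoring rule the thesis uses, so everything here is an assumption made for illustration: the `Click` fields, the class name `PreliminaryAssessor`, the 60-second window, and the rule that repeated clicks from the same address on the same ad raise the score.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Click:
    ip: str           # visitor address (hypothetical field)
    ad_id: str        # identifier of the clicked ad (hypothetical field)
    timestamp: float  # seconds since some reference time


class PreliminaryAssessor:
    """Scores each click using only the recent click stream (illustrative)."""

    def __init__(self, window_seconds=60.0, max_clicks=3):
        self.window_seconds = window_seconds  # assumed short-time window
        self.max_clicks = max_clicks          # assumed number of repeats that saturates the score
        self._recent = {}                     # (ip, ad_id) -> deque of timestamps

    def score(self, click):
        """Return a suspicion score in [0, 1]; higher means more suspicious."""
        key = (click.ip, click.ad_id)
        history = self._recent.setdefault(key, deque())
        # Drop clicks that have fallen out of the short-time window.
        while history and click.timestamp - history[0] > self.window_seconds:
            history.popleft()
        history.append(click.timestamp)
        # Every repeat after the first click raises the score linearly, capped at 1.
        repeats = len(history) - 1
        return min(1.0, repeats / self.max_clicks)
```

With the default settings, four clicks from one address on one ad at seconds 0, 10, 20, and 30 receive scores that rise from 0.0 to 1.0, while a click from the same address much later starts again at 0.0. In the two-step design described above, such a score would only be a first estimate that the assessment modification step later corrects.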
In data processing, the data must first be preprocessed. Because the collected data is irregular, data cleaning, session identification, attribute selection, format conversion, normalization, and similar steps are required. In addition, since both server log data sets and script-collected click streams are gathered, data integration is needed to complete and cross-check the data sets. In the algorithm, the outliers are first isolated and then analyzed separately. Newly arriving data is run through multiple linear regression analysis together with the historical data sets; if the result indicates possible click fraud, it is fed back to the foreground. The foreground is defined relative to the server and includes the site owners, advertisers, and the ad network.
The detection model can effectively detect or block various types of click fraud, can effectively block unintentional invalid clicks, and significantly improves the efficiency of click fraud detection without affecting the ad display rate. Several experiments were run on the detection model, and the experimental results were compared and analyzed. The results show that the proposed scheme can effectively detect click fraud committed by persons who manually, or by means of a computer program, imitate a legitimate web browser user, which demonstrates the feasibility of the model and the effectiveness of the scheme.
Finally, the paper gives a brief summary of its contents, looks ahead to the development trend of click fraud detection, and discusses the deficiencies in the detection scripts, user identification, mining algorithms, follow-up analysis, and so on, which will be addressed in the next steps.
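The two algorithmic steps named in the abstract, isolating outliers and running multiple linear regression on incoming data against historical data, can be sketched as follows. The abstract does not say how outliers are identified or which features enter the regression, so the per-column z-score rule, the threshold of 3.0, and the function names `isolate_outliers`, `fit_linear`, and `predict` are all assumptions for illustration, not the thesis's method.

```python
from statistics import mean, pstdev


def isolate_outliers(rows, threshold=3.0):
    """Split feature rows into (normal, outliers) by per-column z-score.

    The z-score rule is an assumed stand-in for the thesis's outlier mining.
    """
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]
    normal, outliers = [], []
    for row in rows:
        is_outlier = any(
            s > 0 and abs(x - m) / s > threshold
            for x, m, s in zip(row, mus, sigmas)
        )
        (outliers if is_outlier else normal).append(row)
    return normal, outliers


def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    A = [[1.0, *row] for row in X]  # prepend the intercept column
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(n)]
        + [sum(a[i] * t for a, t in zip(A, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]  # [intercept, b1, b2, ...]


def predict(coeffs, row):
    """Evaluate the fitted linear model on one feature row."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))
```

In this sketch, historical click features would be fitted once with `fit_linear`, and a large gap between `predict` and the value observed for a new click would mark it for feedback to the foreground. Rows returned as outliers by `isolate_outliers` would be set aside for the separate analysis the abstract mentions.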
Keywords/Search Tags: Click Fraud, Web Data Mining, Outlier, Online Advertising, Prediction