
Third-Party Tracking And Anti-Adblocking Detection System Based On Machine Learning

Posted on: 2020-10-15  Degree: Master  Type: Thesis
Country: China  Candidate: J X Sun  Full Text: PDF
GTID: 2428330602952181  Subject: Engineering
Abstract/Summary:
As the web has developed, many websites have embedded third-party applications in their pages to enrich content and attract more users [1]. Third-party applications can improve a page's appearance and enhance users' interaction with it. However, some third-party applications infringe on users' private information while serving the page. They analyze users' preferences by collecting browsing history and sequences of interactions on the page in order to deliver advertisements targeted at a particular user. Some third-party applications also try to infer more sensitive information about users, such as pregnancy and childbirth or bad-credit repair. To protect users' private information, researchers and government agencies have proposed a series of protection measures. One of the most effective is to use a blacklist to block third-party applications that exhibit tracking behavior. Such applications typically appear in web pages in four forms: web bugs, iframes, JavaScript files, and Flash files [2,3]. Flash files play a very important role in third-party tracking. This thesis develops a system, DFTrackerDetector, which examines third-party Flash files in web pages to determine whether they exhibit tracking behavior. Flash files call different ActionScript APIs depending on how they behave and what they are used for. DFTrackerDetector therefore takes ActionScript APIs as features, extracts them by static analysis, builds a classifier with a machine learning algorithm, and automatically generates a list of trackers. DFTrackerDetector achieves an accuracy of 94.73% on the test set.

Advertisements play an important role in the development of the Internet. Websites reserve certain positions on the page to display advertisements and earn revenue, so that users can access their services for free. However, driven by profit, some websites have abused advertisements. Some advertisements are placed in prominent positions that interfere with normal reading, and some track users' information, leading to the disclosure of private information. As users become more aware of the need to protect their privacy and seek a clean and safe browsing environment, more and more of them install adblocking plugins in their browsers. This has seriously affected the business model of the online advertising industry and has forced advertisers to adopt a series of countermeasures. Some websites participate in the so-called "acceptable advertising program" so that their advertisements are not blocked by the adblocker and are displayed in full on the page. Other websites deploy JavaScript files on their pages to detect whether an adblocker is installed in the user's browser. Once an adblocker is detected, these JavaScript files respond in various ways: some require users to disable the adblocker completely or to whitelist the site, and some require users to make a donation before they can browse the page normally. In response to this counterattack from advertisers, adblockers in turn detect and filter anti-adblockers. To generate a list of anti-adblockers quickly, this thesis proposes a machine learning-based anti-adblocking detection system called ABDetector. JavaScript files with anti-adblocking behavior and JavaScript files without it call different JavaScript APIs. ABDetector uses these JavaScript APIs as features to build the classifier. Unlike other feature extraction methods, this thesis uses dynamic analysis to extract features from JavaScript files. The accuracy of ABDetector on the test set is 81.46%.
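
The idea of using API calls as classification features can be illustrated with a minimal sketch. It assumes that each script has already been reduced to the set of API names it invokes; the extraction step itself (static analysis for Flash, dynamic analysis for JavaScript) is not shown. Because the abstract does not name the learning algorithm, a small hand-written Bernoulli naive Bayes classifier is used here purely as a stand-in. The function names, the chosen API names, the toy samples, and their labels are all invented for illustration and are not data or code from the thesis.

```python
import math


def train_bernoulli_nb(samples, labels):
    """Fit a Bernoulli naive Bayes model on sets of API names.

    samples: list of sets, each holding the API names one script calls.
    labels:  list of class labels (here 1 = anti-adblocking, 0 = benign).
    """
    vocab = sorted(set().union(*samples))
    model = {"vocab": vocab, "log_prior": {}, "p_present": {}}
    for cls in sorted(set(labels)):
        docs = [s for s, y in zip(samples, labels) if y == cls]
        model["log_prior"][cls] = math.log(len(docs) / len(samples))
        # Laplace smoothing keeps every probability strictly inside (0, 1).
        model["p_present"][cls] = {
            api: (sum(api in d for d in docs) + 1) / (len(docs) + 2)
            for api in vocab
        }
    return model


def predict(model, api_calls):
    """Return the class with the highest log-posterior for one script."""
    scores = {}
    for cls, log_prior in model["log_prior"].items():
        score = log_prior
        for api in model["vocab"]:
            p = model["p_present"][cls][api]
            # Both the presence and the absence of an API are evidence.
            score += math.log(p if api in api_calls else 1.0 - p)
        scores[cls] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: sets of API names observed per script.
TOY_SAMPLES = [
    {"createElement", "offsetHeight", "getComputedStyle"},
    {"createElement", "offsetHeight", "setTimeout"},
    {"getComputedStyle", "offsetHeight", "appendChild"},
    {"addEventListener", "querySelector"},
    {"createElement", "addEventListener", "appendChild"},
    {"querySelector", "setTimeout"},
]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    model = train_bernoulli_nb(TOY_SAMPLES, TOY_LABELS)
    print(predict(model, {"offsetHeight", "getComputedStyle"}))
    print(predict(model, {"addEventListener", "querySelector"}))
```

The same shape would apply to Flash files by substituting ActionScript API names for the JavaScript ones; scripts predicted as positive could then be collected into a candidate blocklist.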
Keywords/Search Tags:Web Security, Privacy, Flash, Third-party Tracking, Anti-adblocker, JavaScript, Machine Learning