Web Advertisement Recognition Method Based On RF-DNN Hybrid Model

Posted on: 2021-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: W F Zhang
GTID: 2428330623467342
Subject: Control engineering
Abstract/Summary:
Web advertising has long been one of the factors that most affects users' browsing experience, and the presence of advertisements can threaten users' information security. Because advertisements are unfriendly to users, ad-blocking software has developed rapidly. Current ad-blocking software falls into two main categories. The first matches the URLs and elements in a web page through pattern recognition. It is represented by the popular AdBlock Plus and is generally more accurate at recognizing advertisements, but maintaining the filter list takes a great deal of manual effort, and it cannot effectively identify new or obfuscated advertisements. The second is based on feature recognition: features are extracted from advertisements and classified by a classifier model. The accuracy of such methods is limited by the set of advertisement features used and by the performance of the classifier.

To address the shortcomings of both approaches and combine their advantages, this thesis proposes a web advertisement recognition method based on an RF-DNN hybrid model. First, the web page is parsed dynamically on the basis of its DOM tree to obtain advertisement instances. Next, the different types of advertisement features that have previously been proposed are summarized into a feature template. Keywords from the EasyList filter list are added to this template, and each advertisement instance is converted into a feature vector. The thesis then maps a random forest model, optimized with the particle swarm optimization algorithm, onto a deep neural network, combining the ease of use of the random forest with the accuracy of the deep neural network. The ensemble approach improves the accuracy of the neural network. The input-output relationship learned during random forest training is transferred to the neural network, which speeds up fitting during neural network training. Mapping the trained model onto the neural network also reduces the number of hyperparameters that must be specified for training, which makes the neural network easier to design.

A comparison experiment on a public data set demonstrates the classification effectiveness of the proposed algorithm. Finally, an ad-filtering plug-in built on this algorithm is compared with other plug-ins. The comparison with similar plug-ins based on feature recognition shows that the combination of feature template and classification model proposed in this thesis performs better. The comparison with a plug-in based on pattern recognition shows that the proposed method reduces the impact on recognition accuracy of an advertisement filter list that is not updated in time.
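
The step of converting an advertisement instance into a feature vector can be illustrated with a minimal sketch. The abstract does not give the thesis's actual feature template, so everything concrete here is an assumption made for illustration: the keyword list (the thesis takes its keywords from EasyList, but these entries are made up), the banner sizes, the dictionary field names, and the `to_feature_vector` helper itself.

```python
import re
from urllib.parse import urlparse

# Hypothetical keywords. The thesis derives its keywords from EasyList;
# these particular entries are illustrative assumptions only.
AD_KEYWORDS = ["ad", "ads", "banner", "sponsor", "doubleclick"]

# Assumed example of a geometric feature: a few common banner sizes.
BANNER_SIZES = {(728, 90), (300, 250), (160, 600)}


def to_feature_vector(element, page_url):
    """Convert one candidate element (a plain dict) into a list of 0/1 features.

    Layout of the result: one bit per keyword, then
    [is_third_party, has_banner_size, is_iframe].
    """
    src = element.get("src", "").lower()
    attrs = (element.get("id", "") + " " + element.get("class", "")).lower()

    # Split on non-alphanumeric characters so that keywords match whole
    # tokens only ("header" must not count as a hit for "ad").
    tokens = set(re.split(r"[^a-z0-9]+", src + " " + attrs))
    keyword_bits = [1 if kw in tokens else 0 for kw in AD_KEYWORDS]

    # A resource served from a different host than the page is third-party.
    src_host = urlparse(src).netloc
    page_host = urlparse(page_url).netloc
    third_party = 1 if src_host and src_host != page_host else 0

    size = (element.get("width", 0), element.get("height", 0))
    banner = 1 if size in BANNER_SIZES else 0
    is_iframe = 1 if element.get("tag") == "iframe" else 0

    return keyword_bits + [third_party, banner, is_iframe]


candidate = {
    "tag": "iframe",
    "src": "https://ads.example.net/banner/728x90.html",
    "id": "top-slot",
    "class": "sponsor box",
    "width": 728,
    "height": 90,
}
vector = to_feature_vector(candidate, "https://news.example.com/story")
```

Vectors of this kind are what a classifier such as a random forest or a neural network would take as input. Matching on whole tokens instead of substrings is a design choice of this sketch, made to avoid false keyword hits on ordinary words.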
Keywords/Search Tags: advertising recognition, particle swarm optimization, random forest, feature template, deep neural network