Research And Implementation Of Mobile App Promotional Attack Detection

Posted on: 2023-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: Q Guo
GTID: 2558306914479264
Subject: Information security
Abstract/Summary:
With the rapid development of the mobile internet, the number of mobile apps has grown explosively. To attract more users, some developers are willing to violate the rules and carry out promotional attacks (PAs), including fake downloads, fake comments, and other attacks. The prevalence of PAs has eroded mobile app stores with fake data, which leads to unfair competition. Such attacks not only disrupt normal network order and cause a vicious circle of "bad money driving out good money", but also increase the risk of users being infected by malicious apps and may even damage the security of mobile devices. Promotional attack detection has therefore become a key problem that mobile app stores urgently need to solve. Recently, researchers have extracted mobile app features using semantic analysis, graph analysis, and other techniques, and have used machine learning classification algorithms to detect PAs. However, as data volume keeps growing and PA patterns keep evolving, existing detection schemes based on supervised learning fall short in detecting unknown patterns and struggle to cope with escalating attacks. To this end, this thesis proposes a PA detection scheme that combines semi-supervised and supervised learning algorithms. The main research results are as follows:

(1) To fully learn the characteristics of PAs, this thesis designs app features along four dimensions: popularity, metadata, developers, and reviews. During feature extraction, for each time window, we extract the popularity features (five features, such as app ranking and the number of new downloads) at each timestamp and stitch them into a time series in order of collection time. The popularity time series describes the dynamic changes in users’ attention to a given app within a given time window. Meanwhile, we extract the metadata, developer, and review features corresponding to the detection timestamp and stitch them into a thirteen-dimensional background information vector, which represents the context of a popularity time series.

(2) Because existing supervised learning methods cannot detect unknown patterns, this thesis proposes an elimination method based on semi-supervised learning, which uses an improved CVAE model to screen out potential PA apps. We first construct an app promotion whitelist and then build a training dataset from this list, containing the popularity time series and background information vectors of the listed apps. By learning the distribution of normally promoted apps, the improved CVAE model can calculate the reconstruction probability of the input data and then screen out potential PA apps. To enable the CVAE model to fully learn the in-depth features contained in the popularity time series, we take the traditional CVAE model as the main architecture, replace the inference network and the generation network with LSTMs, and introduce a constrained BN layer so that the KL divergence has a lower bound greater than zero during training. This ensures that the generative network can learn more information from the hidden variables during training and solves the problem of KL vanishing.

(3) A CNN-LSTM based fusion detection model is proposed to further detect PA apps among the potential subjects. At this stage, considering the advantages of the CNN-LSTM structure in processing time series data, this thesis combines CNN and LSTM networks to fully extract the local correlation and temporal correlation of the popularity time series. After the result is merged with the background information features, a SoftMax function gives a two-class judgment. Compared with existing supervised detection methods, our supervised detection model extracts the local features of the time series by introducing a one-dimensional convolutional neural network, which improves the detection accuracy for abnormal promotion behavior patterns. Moreover, the weight sharing of the convolutional neural network can effectively shorten the training time of the model.

(4) Finally, to combine the advantages of the two models and make full use of the limited dataset, a model training strategy based on under-sampling is proposed. During training, the improved CVAE model is used to remove the negative samples that are far from the positive samples. Increasing the proportion of samples near the classification boundary between positive and negative samples lets the proposed supervised learning model make the most of its learning ability. We then select all positive samples, together with the negative samples in the top 90% by reconstruction probability, to construct a new dataset for the next round of model training. Through multiple iterations, the best match between the improved CVAE model and the proposed supervised learning model is obtained.

The experimental results show that our method, which combines semi-supervised and supervised learning, is superior to existing detection methods in detection accuracy, efficiency, and ability to detect unknown PA patterns. Specifically, our method achieves a detection accuracy of 96.36% and a recall of 95.39% on the test dataset. We believe that the above methods can meet the needs of mobile app stores and help them curb unfair competition.
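The sample selection rule in (4) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `undersample`, the parameter `keep_fraction`, the `(app_id, label, reconstruction_probability)` tuple layout, the label encoding (1 for positive, 0 for negative), and the rounding rule are all assumptions made for illustration. The reconstruction probabilities are taken as given, as if already computed by the improved CVAE model.

```python
from typing import List, Tuple

# Hypothetical record layout: (app_id, label, reconstruction_probability).
# Label encoding is an assumption: 1 = positive sample, 0 = negative sample.
Sample = Tuple[str, int, float]


def undersample(samples: List[Sample], keep_fraction: float = 0.9) -> List[Sample]:
    """Keep all positives plus the negatives ranked highest by reconstruction probability."""
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]

    # Rank negatives from highest to lowest reconstruction probability.
    negatives.sort(key=lambda s: s[2], reverse=True)

    # Number of negatives to retain; rounding to the nearest integer is an assumption.
    n_keep = round(keep_fraction * len(negatives))

    return positives + negatives[:n_keep]
```

Calling `undersample` once per training round, with reconstruction probabilities recomputed each time, would correspond to one iteration of the strategy; how the thesis schedules these iterations and decides when to stop is not stated in the abstract.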
Keywords/Search Tags: mobile app, promotional attack detection, ranking fraud, abnormal exposure, neural-network algorithm