Research On Network Traffic Anomaly Detection And Privacy Security Protection Based On Deep Learning

Posted on: 2021-03-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X D Yan
GTID: 1368330605481249
Subject: Information security
Abstract/Summary:
With the development of new infrastructure such as 5G, artificial intelligence, big data, and the industrial Internet, networks have penetrated every aspect of people's daily work and life. This also brings many security problems: viruses, vulnerabilities, and attacks of all kinds can cause huge losses. At the same time, it places higher demands on data protection and security, and guaranteeing privacy and data security has become the biggest current challenge. One important benchmark of network security is the ability to detect network anomalies. Intelligent methods such as machine learning and deep learning are widely used in abnormal traffic detection. However, as network scale continues to surge, the dimensionality of massive, dynamic big data keeps increasing, and the means of cybercrime are becoming ever more concealed. As a result, traditional security detection methods cannot meet the requirements of security attack and defense, and the risk information they can mine is limited. In addition, the extra descriptive information used in feature engineering increases the time complexity of experiments and the complexity of the final model; it also dramatically increases computing time and causes the "curse of dimensionality". For privacy protection, new attacks based on Generative Adversarial Networks (GAN) can break the protection offered by collaborative deep learning, so that the training data set can be reconstructed and users' private information disclosed. How to improve the availability of data without disclosing sensitive personal information is a major problem for deep learning applications and will greatly affect the future development of deep learning.

This dissertation mainly studies abnormal traffic detection in complex scenarios, domain name detection based on bidirectional LSTM, malicious domain name detection based on URL embedding, and abnormal traffic detection based on collaborative deep learning. The main work and innovations are as follows:

(1) To address data skew during the analysis of massive data, and the performance bottlenecks caused by task timeouts and memory overflow in the cluster, this dissertation proposes a mini-batch gradient descent hinge classification algorithm based on an adaptive learning rate and momentum. It is used to detect abnormal data and minimize the impact of security attacks. Compared with traditional neural networks, decision trees, and logistic regression, the algorithm greatly improves the scale and speed of deep network training, and minimizes the loss function over the whole training set to obtain the global optimum. We also adopt an asynchronous batch gradient descent algorithm and tune it from the perspective of serialization and compression. Training on data subsets with the batch gradient descent algorithm reduces the pressure on the parameter server and solves the data skew problem in the big data Shuffle phase. Implementing a parallel framework for the algorithm speeds up the processing of massive data and greatly reduces the burden on the parameter server.

(2) Faced with the security challenges of massive network data and complex, high-dimensional intrusion behavior features, traditional detection technology suffers from insufficient modeling ability and the "curse of dimensionality". We therefore propose a typosquatting domain name detection technology based on bidirectional LSTM to improve the speed of typosquatting detection on large domain name collections. By studying long short-term memory (LSTM) networks and convolutional neural networks, very complex functions can be learned through hierarchical abstraction. This makes it possible to handle large amounts of complex, high-dimensional data better, improve modeling ability, and speed up typosquatting domain name detection on large domain name collections. Most existing typosquatting detection work, however, is based on computing the edit distance between pairs of domain names, which tends to produce a large number of false positives for short domain names. Collecting additional domain-related information can improve detection, but it introduces a very large cost. A lightweight detection strategy based on the domain name string, with bidirectional LSTM making full use of the domain name context, improves detection. In addition, designing a locality-sensitive hash function for domain names improves the speed of detection on large domain name collections. Together, these remedy the shortcomings of edit-distance-based detection and effectively detect the abuse of typosquatting domain names.

(3) Aiming at the concealment and dynamic variability of malicious domain names, an unsupervised learning algorithm based on domain name embedding is proposed to replace feature engineering. It effectively improves the extraction of malicious domain features and, in turn, detection performance. Machine learning algorithms make it easier to identify abnormal information or malicious domain names hidden in large volumes of traffic, and high-quality features can greatly improve the performance of machine learning models. However, feature engineering tasks must be performed in memory, and they also introduce interference from human, subjective factors and the curse of dimensionality. A deep neural network model based on time series improves the extraction of malicious domain name features. We establish and store the mapping between each URL and its corresponding distributed representation, and combine several key parameters to explore the URL embedding model. In this way, we remove the interference from human subjective factors and the curse of dimensionality caused by feature engineering, and effectively improve the performance of malicious domain name recognition.

(4) To address the serious privacy leakage caused by generative adversarial network attacks during collaborative deep learning training, we propose a privacy protection method based on deep convolutional generative adversarial networks. It effectively improves protection against attacks based on the GAN model. During collaborative deep learning training, attacks based on deep convolutional generative adversarial networks create a serious risk of information leakage. The proposed method encrypts parameters as they are transmitted in the deep network. By adjusting the training parameters, the GAN-based attack is rendered ineffective and the information is effectively protected. On this basis, the stability of the privacy protection method based on deep convolutional generative adversarial networks is improved, and its effectiveness is verified by experiments.
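The abstract notes that detection based on the edit distance between domain name pairs produces many false positives on short domain names. The following is a minimal sketch of that baseline only, not the dissertation's bidirectional LSTM method. The helper names (`edit_distance`, `flag_candidates`), the distance threshold of 1, and all domain names are hypothetical and chosen purely for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or match
            ))
        prev = curr
    return prev[-1]


def flag_candidates(target: str, domains: list[str], max_distance: int = 1) -> list[str]:
    """Return domains within max_distance edits of target (hypothetical baseline rule)."""
    return [
        d for d in domains
        if d != target and edit_distance(d, target) <= max_distance
    ]


# Hypothetical data: a long target name yields a plausible lookalike ...
long_hits = flag_candidates("example.com", ["examp1e.com", "unrelated.org"])

# ... while a short target name also flags names that merely happen to be short.
short_hits = flag_candidates("ab.com", ["ac.com", "ad.com", "xyz.com"])

print(long_hits)
print(short_hits)
```

With a fixed threshold, a single edit is a small fraction of a long name but a large fraction of a short one, so short names such as the hypothetical `ac.com` and `ad.com` are flagged even if they are unrelated to the target. This is the weakness that motivates using more of the domain name's context than a raw distance threshold.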
Keywords/Search Tags:Collaborative Deep Learning, Generative Adversarial Networks, Anomaly Detection, Privacy Protection, Feature Engineering