
Active Learning And Adversarial Deep Learning Algorithms Based On Data Manifold Structure

Posted on: 2018-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: W B Guo
Full Text: PDF
GTID: 2428330590977628
Subject: Control Science and Engineering
Abstract/Summary:
Active learning is a framework rather than a complete learning algorithm: it must be combined with a specific supervised learning method. In this thesis we focus on support vector machines (SVM) and deep neural networks (DNN). Integrating SVM with active learning reduces the need for labeled data and lowers the training cost, while placing DNN inside an active learning framework aims to improve their resistance to adversarial samples. However, these integrations face specific problems. Active learning SVM is sensitive to its initial state and still wastes information about the structure of the data, while state-of-the-art active learning DNN methods cannot provide a sufficient security guarantee: attackers can still craft model-specific adversarial samples against the target model. To solve these problems from the perspective of the data manifold structure, this thesis carries out the following research (minimal code sketches illustrating points (1) to (3) follow point (3), and one for point (4) follows that point):

(1) To make full use of the information contained in unlabeled data, an improved active learning SVM is proposed. Before the active learning process starts, a spectral clustering algorithm divides the dataset into two categories, and instances located at the boundary between the two categories are labeled to train the initial classifier. To reduce the computational cost, an incremental update method is added to the proposed algorithm. Applied to several text classification problems, the algorithm is more efficient and more accurate than the traditional active learning algorithm.

(2) A novel active learning SVM with low-rank representation (LRR) subspace clustering is proposed to solve the sensitivity to the initial state. At the initialization step, LRR is applied to the whole dataset to obtain a representation that removes errors and noise in the data; subspace clustering is then performed on an affinity graph built from the representation matrix. Data points lying in the sparse region between the two clusters are selected as the initial support vectors. After initialization, active learning SVM is used to classify the dataset into two classes. Experiments on several standard binary classification datasets show that the proposed method achieves higher accuracy than other state-of-the-art methods and eliminates the influence of different initial states.

(3) An active learning SVM based on low-rank transformation (LRT) for binary classification is proposed to take full advantage of the data distribution and of the information contained in the labeled data. In each iteration, before updating the classifier, the labeled samples chosen by the selection engine are used to derive a transformation matrix. The learned matrix projects the whole dataset onto a union of subspaces in which samples from different classes lie in nearly orthogonal subspaces of the original data space, and the SVM is then retrained. The transformation is carried out in the same kernel space as the SVM. As the iterations proceed, more data are labeled, which makes it easier to explore the intrinsic structure of the dataset and yields better classification performance. Experiments on several standard datasets indicate that the proposed algorithm outperforms other classification algorithms.
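The following sketch illustrates the overall shape of contribution (1): spectral clustering seeds the initial labeled set, after which the SVM is retrained and the points nearest the decision boundary are queried. The boundary-selection heuristic (distance to the opposite cluster's centroid), the margin-based query rule, and all parameter values are illustrative assumptions; the thesis's incremental SVM update is not reproduced here, so the classifier is simply refit each round.

```python
# Sketch of (1): spectral-clustering initialization + margin-based querying.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.svm import SVC

def initial_labels_from_spectral_clustering(X, n_init=10):
    """Cluster the unlabeled pool into two groups and pick points near the
    boundary between them (here: closest to the opposite cluster's centroid,
    a stand-in criterion for 'boundary instances')."""
    clusters = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                                  random_state=0).fit_predict(X)
    centroids = np.array([X[clusters == c].mean(axis=0) for c in (0, 1)])
    d_other = np.linalg.norm(X - centroids[1 - clusters], axis=1)
    return np.argsort(d_other)[:n_init]

def active_learning_svm(X, oracle, n_queries=50, batch=5):
    """oracle(indices) returns true labels; it stands in for human annotation.
    Assumes the initial queries already contain both classes."""
    labeled = list(initial_labels_from_spectral_clustering(X))
    y = {i: l for i, l in zip(labeled, oracle(labeled))}
    clf = SVC(kernel="rbf")
    for _ in range(n_queries // batch):
        clf.fit(X[list(y)], [y[i] for i in y])
        unlabeled = np.setdiff1d(np.arange(len(X)), list(y))
        # Margin sampling: query the points closest to the decision boundary.
        margin = np.abs(clf.decision_function(X[unlabeled]))
        for i in unlabeled[np.argsort(margin)[:batch]]:
            y[i] = oracle([i])[0]
    return clf.fit(X[list(y)], [y[i] for i in y])
```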
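For contribution (2), the sketch below builds the affinity graph from a low-rank representation and subspace-clusters the pool. It uses the known closed-form solution of the noise-free LRR problem (the shape-interaction matrix Z = V_r V_r^T); the thesis's full LRR with an explicit error term would instead need an ALM/ADMM solver. The "sparse region" criterion (lowest within-cluster affinity) and the parameter values are assumptions for illustration only.

```python
# Sketch of (2): LRR-style affinity graph + subspace clustering for initialization.
import numpy as np
from sklearn.cluster import SpectralClustering

def lrr_affinity(D, rank):
    """Closed-form low-rank representation of the columns of D (d x n):
    Z = V_r V_r^T, symmetrized into an affinity graph."""
    _, s, Vt = np.linalg.svd(D, full_matrices=False)
    Vr = Vt[:rank].T                      # n x rank
    Z = Vr @ Vr.T                         # shape-interaction matrix (n x n)
    return (np.abs(Z) + np.abs(Z).T) / 2  # symmetric affinity graph

def initial_points(X, rank=5, n_init=10):
    """Subspace-cluster the pool (rows of X are samples) and return indices of
    points sitting in the sparse region between the two clusters."""
    W = lrr_affinity(X.T, rank)           # columns of X.T are the samples
    labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                                random_state=0).fit_predict(W)
    # Low total affinity to a point's own cluster ~ far from the cluster core,
    # i.e. in the sparse boundary region (an illustrative stand-in criterion).
    own = np.array([W[i, labels == labels[i]].sum() for i in range(len(labels))])
    return np.argsort(own)[:n_init]
```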
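For contribution (3), the sketch below only shows the loop structure: in each round a transformation is learned from the current labeled set, the whole pool is projected through it, and the SVM is retrained before the next queries are selected. The `learn_transform` helper is a hypothetical stand-in (a ridge regression onto orthogonal one-hot class targets), not the thesis's kernel-space low-rank transformation; labels are assumed to be 0/1.

```python
# Sketch of (3): transform-then-retrain active-learning loop (stand-in transform).
import numpy as np
from sklearn.svm import SVC

def learn_transform(X_lab, y_lab, reg=1e-2):
    """Hypothetical stand-in for the low-rank transformation: map labeled
    samples toward orthogonal class targets via regularized least squares."""
    Y = np.eye(2)[y_lab]                               # one-hot targets (n x 2)
    d = X_lab.shape[1]
    return np.linalg.solve(X_lab.T @ X_lab + reg * np.eye(d), X_lab.T @ Y)

def lrt_active_learning(X, oracle, init_idx, n_rounds=10, batch=5):
    y = {i: l for i, l in zip(init_idx, oracle(init_idx))}
    for _ in range(n_rounds):
        lab = list(y)
        T = learn_transform(X[lab], np.array([y[i] for i in lab]))
        Xt = X @ T                                     # project the whole pool
        clf = SVC(kernel="rbf").fit(Xt[lab], [y[i] for i in lab])
        unlab = np.setdiff1d(np.arange(len(X)), lab)
        margin = np.abs(clf.decision_function(Xt[unlab]))
        for i in unlab[np.argsort(margin)[:batch]]:    # query nearest to margin
            y[i] = oracle([i])[0]
    return clf
```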
(4) DNN have proven highly effective in applications such as image recognition and the automated analysis of security or traffic camera footage, for example measuring traffic flows or spotting suspicious activity. Despite this superior performance, it has recently been shown that a DNN is susceptible to a particular type of attack that exploits a fundamental flaw in its design: an attacker can craft a synthetic example, referred to as an adversarial sample, that causes the DNN to produce an output behavior chosen by the attacker, such as a misclassification. Addressing this flaw is critical if DNN are to be used in sensitive fields such as cybersecurity. Previous work provided various defense mechanisms by placing the DNN inside an active learning framework, but a thorough analysis of the underlying flaw shows that the effectiveness of such methods is limited. We therefore propose a new adversary-resistant technique that prevents attackers from constructing impactful adversarial samples by randomly nullifying features within the input samples. Evaluations on the MNIST and CIFAR-10 datasets show empirically that the technique significantly boosts the robustness of DNN against adversarial samples while maintaining high classification accuracy. Applying the proposed method to malware classification also yields better resistance and classification performance than the state-of-the-art method.
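A minimal sketch of random feature nullification follows. The idea is that each input feature is zeroed out independently at random, at both training and inference time, so an attacker cannot know which input dimensions will actually reach the network when crafting a perturbation. The nullification rate and the way the mask is wired into the model are illustrative assumptions, not values fixed by the thesis.

```python
# Sketch of (4): random feature nullification applied to the network input.
import numpy as np

def nullify_features(x, p=0.3, rng=None):
    """Randomly zero a fraction p of the input features (per sample, per call)."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= p      # keep each feature with probability 1-p
    return x * mask

# Usage (hypothetical model object): logits = model(nullify_features(x)).
# Unlike dropout, the mask is applied to the *input* and stays stochastic at
# test time, so each query to the model sees a different nullification pattern.
```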
Keywords/Search Tags:Manifold Structure, Active Learning Support Vector Machines, Adversarial Deep Learning, Random Feature Nullification