
Robustness Of Semi-supervised Learning Algorithms

Posted on: 2017-09-01 | Degree: Master | Type: Thesis
Country: China | Candidate: Y J Li | Full Text: PDF
GTID: 2348330536453090 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer science and technology, a great number of unlabeled samples have become available on the network and in the real world, and such samples are easier to collect than labeled ones. As a result, traditional supervised learning algorithms can no longer meet current needs, and how to use these unlabeled samples to improve the performance of a learning machine has become a hot topic in machine learning research. Semi-supervised learning mainly considers how to use a small number of labeled samples together with a large number of unlabeled samples for training and classification. On the one hand, semi-supervised learning is significant because it reduces the cost of manual annotation and improves learning performance. On the other hand, the robustness of semi-supervised learning algorithms in an adversarial environment cannot be ignored; only when they are robust can they be widely used in practical applications.

Existing robustness analysis methods are mainly proposed for supervised learning algorithms in an adversarial environment, whereas the robustness of unsupervised and semi-supervised learning has not been fully explored. To design an effective semi-supervised algorithm for the adversarial environment, we first need to find the corresponding attack methods; effective defense algorithms against those attacks can then be put forward. This remains a big challenge.

In this thesis, we first briefly introduce the background of machine learning and semi-supervised learning, along with five semi-supervised learning algorithms. We then focus on the robustness of the self-training and co-training algorithms. By analyzing both the models of the algorithms and the experimental results, we find that:

1. A semi-supervised learning algorithm is essentially an iterative process that terminates when a certain condition is met.
2. During the iterations of self-training, as the number of iterations increases, the accuracy curve first remains at a stable value, then rises rapidly, and finally reaches a peak.
3. The two child classifiers of the co-training algorithm may assign two different labels to the same unlabeled sample, so there is an uncertain area. For each sample in this area, the label assigned by the combined classifier, which is constructed from the two child classifiers, is also the most uncertain.
4. In each iteration, only samples distributed on both sides of the classification boundary affect the accuracy of the classifier. In addition, the closer a sample is to the classification boundary, the more important it is in determining where the boundary lies.

Based on these four characteristics, we propose two causative attack methods, one for the self-training algorithm and one for the co-training algorithm. For self-training, only attacked samples near the classification boundary are selected and added to the training set, and this is done while the accuracy curve is rising. For co-training, we likewise select samples located near the classification boundary to attack. We then propose our attack method, USNB. Our strategy is to move the attacked samples according to a set of rules, thereby causing the classifier to make wrong decisions and reducing its accuracy.

Our experiments are designed on sampled data. We consider an intrusion detection application in which a malicious attacker tries to attack the classification system and the system does not know what form the intrusion takes. From the attacker's point of view, we put forward an effective attack strategy, which is the first step toward, and the foundation for, an effective defense strategy in the future. Finally, our experiments verify that the proposed method has strong attack ability and achieves more promising results than current attack methods.
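
The self-training loop and the idea of "samples near the classification boundary" described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the 1-D nearest-centroid classifier, the function names (`fit_centroids`, `self_train`, `nearest_to_boundary`), and the batch size. It is not an implementation of USNB, whose movement rules the abstract does not specify; it only shows how an iterative self-training process terminates and how candidates close to the boundary might be ranked.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Fit a 1-D nearest-centroid classifier; returns (centroid_0, centroid_1)."""
    c0 = mean(x for x, y in zip(samples, labels) if y == 0)
    c1 = mean(x for x, y in zip(samples, labels) if y == 1)
    return c0, c1

def predict(model, x):
    """Assign x to the class whose centroid is closer (ties go to class 0)."""
    c0, c1 = model
    return 0 if abs(x - c0) <= abs(x - c1) else 1

def margin(model, x):
    """Distance from x to the decision boundary (midpoint of the two centroids)."""
    c0, c1 = model
    return abs(x - (c0 + c1) / 2)

def self_train(labeled_x, labeled_y, unlabeled, batch=2, max_iter=10):
    """Illustrative self-training: repeatedly pseudo-label the most confident samples."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled)
    model = fit_centroids(xs, ys)
    for _ in range(max_iter):
        if not pool:
            break  # termination condition: no unlabeled samples remain
        # Treat "far from the boundary" as "confident" (an assumption of this sketch).
        pool.sort(key=lambda x: margin(model, x), reverse=True)
        chosen, pool = pool[:batch], pool[batch:]
        for x in chosen:
            xs.append(x)
            ys.append(predict(model, x))  # pseudo-label from the current model
        model = fit_centroids(xs, ys)     # retrain on the enlarged training set
    return model

def nearest_to_boundary(model, candidates, k):
    """Rank candidates by closeness to the boundary and return the k closest."""
    return sorted(candidates, key=lambda x: margin(model, x))[:k]

# Hypothetical toy data: two labeled points and six unlabeled points.
model = self_train([0.0, 10.0], [0, 1], [1.0, 2.0, 4.0, 6.0, 8.0, 9.0])
targets = nearest_to_boundary(model, [1.0, 4.5, 9.0, 5.2], k=2)
```

In this toy setting, `targets` holds the two candidates closest to the learned boundary, which, according to the fourth finding above, are the samples with the greatest influence on where the boundary ends up.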
Keywords/Search Tags:Semi-supervised Learning, Self-training, Co-training, Adversarial Environment, Robustness