Study And Its Application On Classification Algorithm Based On Deep Learning

Posted on: 2017-02-19, Degree: Master, Type: Thesis
Country: China, Candidate: T Wu, Full Text: PDF
GTID: 2308330482995634, Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, our lives are full of data of many kinds, which contain abundant and meaningful information to be mined. Classification is one of the most popular methods of data analysis: a classifier model is built from training samples and then used to predict the class of unlabeled samples. For example, a classifier can detect spam after learning from a large number of normal emails, and it can identify network attacks after learning normal internet traffic. However, events such as spam and attacks are accidental or low-probability events in real life, so data about them are difficult to collect. The classifier is therefore usually trained on, and describes, the normal data, which are easy to collect. The model is built to distinguish the normal data from outliers and in this way to predict the class of new data. This kind of problem is defined as the one-class classification problem with imbalanced data.

Support Vector Data Description (SVDD) is one of the best-known one-class classification methods for problems where the sample data are high-dimensional but limited in number. Yet the results of SVDD can be greatly affected when the target data are poorly distributed and their density varies strongly. An improved SVDD algorithm, SA_SVDD, is introduced in this paper, which combines the AP clustering algorithm with SVDD. The main procedure of SA_SVDD is as follows. First, AP clustering is applied to the training set to obtain a set of compact subclasses. The boundary of each subclass is then identified by SVDD, and these boundaries are used as the final classification criteria. During this process, an improved Particle Swarm Optimization (PSO) is employed to optimize all the parameters of SVDD self-adaptively. Hence the proposed algorithm needs only a sample set as input, and the parameters of the algorithm are generated adaptively.
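The cluster-then-enclose decision rule described above can be illustrated with a minimal sketch. It rests on several assumptions that are not from the thesis: the subclass labels are taken as given (the thesis obtains them with AP clustering), SVDD is replaced by its hard-margin, linear-kernel special case, which is the minimum enclosing ball, approximated here with the Badoiu-Clarkson iteration, and no PSO parameter tuning is performed. All function names are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def enclosing_ball(points, iterations=1000):
    """Approximate the minimum enclosing ball of `points`.

    Badoiu-Clarkson iteration: repeatedly step the center toward the
    current farthest point with a shrinking step size 1 / (i + 1).
    """
    center = list(points[0])
    for i in range(1, iterations + 1):
        far = max(points, key=lambda p: dist(p, center))
        center = [c + (f - c) / (i + 1) for c, f in zip(center, far)]
    # Radius chosen so that every training point lies inside the ball.
    radius = max(dist(p, center) for p in points)
    return center, radius


def fit_subclass_balls(points, labels):
    """Fit one ball per subclass; `labels` stands in for a clustering result."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return [enclosing_ball(g) for g in groups.values()]


def is_target(x, balls):
    """Accept x as target-class if it falls inside any subclass boundary."""
    return any(dist(x, c) <= r for c, r in balls)


# Two compact groups of "normal" samples with an empty gap between them.
train = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

balls = fit_subclass_balls(train, labels)   # one boundary per subclass
single = [enclosing_ball(train)]            # one boundary for all data

print(is_target((3, 3), balls))   # point in the gap, per-subclass boundaries
print(is_target((3, 3), single))  # same point, single global boundary
```

The point (3, 3) lies in the empty region between the two groups. A single ball around all the data has to cover that region, while the per-subclass balls do not, which is the motivation for clustering before describing the data.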
SA_SVDD is evaluated on several benchmark datasets and proves markedly effective compared with other one-class classification methods.

In order to reduce the computation and increase the accuracy, a deep learning algorithm, the sparse autoencoder (SAE), is introduced in this paper to reduce the dimension of the data. We then propose an improved SVDD algorithm, termed SAE_SVDD, which combines SAE and SA_SVDD. SAE_SVDD first uses SAE to compress the features of the data, that is, to reduce the dimension of the data and decrease its sparsity, and then uses SA_SVDD to classify the data after dimensionality reduction. The research is motivated by a survey of prenatal depression among pregnant women. The dataset was obtained through questionnaires together with doctors' professional diagnoses, and the data were divided into healthy pregnant women and pregnant women with depression. The number of patient samples is very small, while the number of healthy samples is extremely large. What is more, the dataset has a large number of features. Therefore, when SA_SVDD is applied directly to this dataset, the running time is too long and the classification performance is poor, whereas SAE_SVDD effectively solves these problems. In addition, extensive experiments on the UCI datasets show that both the running time and the classification performance improve significantly.
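The compression step can be sketched as a tiny sparse autoencoder in plain Python. This is only an illustration under stated assumptions, not the thesis implementation: it uses one sigmoid hidden layer, a linear decoder, an L1 penalty on the hidden activations as the sparsity term (the thesis does not state which penalty it uses), and per-sample gradient descent. The toy data, the hyperparameters, and all names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(x, W1, b1):
    """Map an input vector to its low-dimensional hidden code."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W1, b1)]


def train_sparse_autoencoder(X, hidden, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Train the encoder/decoder by per-sample gradient descent.

    Loss per sample: 0.5 * ||decode(encode(x)) - x||^2 + lam * sum(code).
    Returns the encoder parameters and the mean loss of each epoch.
    """
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    b2 = [0.0] * d
    history = []
    for _ in range(epochs):
        total = 0.0
        for x in X:
            a = encode(x, W1, b1)
            y = [sum(w * ak for w, ak in zip(row, a)) + b
                 for row, b in zip(W2, b2)]
            dy = [yj - xj for yj, xj in zip(y, x)]  # reconstruction error
            total += 0.5 * sum(e * e for e in dy) + lam * sum(a)
            # Backpropagate before touching W2 (da depends on old W2).
            da = [sum(W2[j][k] * dy[j] for j in range(d)) + lam
                  for k in range(hidden)]
            dz = [da[k] * a[k] * (1.0 - a[k]) for k in range(hidden)]
            for j in range(d):
                for k in range(hidden):
                    W2[j][k] -= lr * dy[j] * a[k]
                b2[j] -= lr * dy[j]
            for k in range(hidden):
                for i in range(d):
                    W1[k][i] -= lr * dz[k] * x[i]
                b1[k] -= lr * dz[k]
        history.append(total / len(X))
    return (W1, b1), history


# Toy data: six redundant features that really carry two bits of information.
X = [[1, 1, 1, 0, 0, 0],
     [0, 0, 0, 1, 1, 1],
     [1, 1, 1, 1, 1, 1],
     [0, 0, 0, 0, 0, 0]]

(W1, b1), history = train_sparse_autoencoder(X, hidden=2)
codes = [encode(x, W1, b1) for x in X]  # 6-dimensional inputs -> 2-dimensional codes
print(history[0], history[-1])
```

In a pipeline of the kind the abstract describes, the low-dimensional `codes` rather than the raw feature vectors would be passed to the one-class classification stage, so the boundary fitting runs on far fewer dimensions.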
Keywords/Search Tags:Deep Learning, Support Vector Data Description, Parameter Optimization, Self-adaptive