Research And Implementation Of Feature Selection Algorithm Based On Autoencoder

Posted on: 2020-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: W Huang
GTID: 2428330623951439
Subject: Software engineering
Abstract/Summary:
Deep learning is an important research direction in the field of artificial intelligence. It originates from the study of biological neural networks: it combines low-level features into more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of data. Stacked autoencoders and sparse autoencoders are two models commonly used in deep learning. Both learn features from the data, obtain feature representations at different levels, and improve the classification accuracy of the model. However, the irrelevant features learned by autoencoder models consume a large amount of computing and storage resources during neural network training. To reduce training time, feature selection can be performed on the feature set learned by the autoencoder. Feature selection not only helps in understanding the trained autoencoder model but also improves the generalization ability of the model. Therefore, this paper applies feature selection to the feature set learned by the autoencoder model to obtain better classification performance. First, the paper describes the workflow of deep learning and related models. Then it discusses the relevant techniques used in feature extraction and feature selection. Finally, it proposes a stacked autoencoder model based on cross entropy and a denoising autoencoder model based on squared error. The main work of this paper is as follows.

To address the problems of feature engineering in machine learning, obtain the deep essential feature information of the target, and improve recognition accuracy, this paper proposes a stacked autoencoder feature selection algorithm based on cross entropy (CESABF). CESABF combines two common approaches in machine learning: supervised learning and unsupervised learning. First, a three-layer stacked autoencoder model composed of three autoencoders is built and trained with an unsupervised method. Then, the output of the three-layer stacked autoencoder is used as the input of a softmax classifier, and supervised training is used to fine-tune the parameters of the model. Finally, using cross entropy as the criterion, the algorithm calculates the influence of each feature on the dataset in the above model and deletes the features that increase the cross entropy of the dataset.

To enable the autoencoder to learn sparse features of the data, reduce the redundancy of the feature set, and improve the generalization ability of the network model, this paper proposes a sparse autoencoder feature selection algorithm based on the squared error loss (SESABF). The sparse autoencoder was introduced to address over-fitting and improve the generalization ability of the model. It builds on the basic autoencoder by adding a sparsity constraint, which makes the features learned by the model more robust. First, a classical autoencoder is created, a sparsity constraint is added to it, and the data are encoded. The encoded dataset is then used as the input of a softmax classifier, which is fine-tuned with supervised training. Finally, using squared error as the evaluation criterion, the algorithm calculates the influence of each feature on the squared error of the dataset and retains the features that reduce the squared error.
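The selection step shared by CESABF and SESABF (measure each learned feature's influence on a loss, then drop the features that make the loss worse) can be illustrated with a minimal NumPy sketch. The abstract does not say how a feature's influence is measured, so this sketch assumes it is estimated by zeroing out one encoded feature at a time and re-evaluating an already-trained softmax classifier. It also assumes the squared error is taken between the classifier output and the one-hot label. The names `select_features`, `H`, `W`, and `b`, and the toy data, are hypothetical and not taken from the thesis.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with max-subtraction for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class (CESABF-style criterion).
    n = labels.shape[0]
    return float(-np.log(probs[np.arange(n), labels] + 1e-12).mean())

def squared_error(probs, labels):
    # Mean squared error against one-hot labels (assumed SESABF-style criterion).
    onehot = np.eye(probs.shape[1])[labels]
    return float(((probs - onehot) ** 2).sum(axis=1).mean())

def select_features(H, labels, W, b, loss_fn):
    """Keep the features whose removal does not lower the loss.

    H: (n_samples, n_features) encoded features from the autoencoder.
    W, b: weights and bias of an already-trained softmax classifier.
    Zero-masking as the influence measure is an assumption of this sketch.
    """
    base = loss_fn(softmax(H @ W + b), labels)
    keep = []
    for j in range(H.shape[1]):
        masked = H.copy()
        masked[:, j] = 0.0  # ablate feature j
        loss_without = loss_fn(softmax(masked @ W + b), labels)
        if loss_without >= base:  # feature j was not increasing the loss
            keep.append(j)
    return keep

# Toy data: features 0 and 1 support the correct class,
# while feature 2 pushes class-0 samples toward class 1.
H = np.array([[1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0]])
labels = np.array([0, 0, 1, 1])
W = np.array([[2.0, -2.0],
              [-2.0, 2.0],
              [-1.0, 1.0]])
b = np.zeros(2)

kept_ce = select_features(H, labels, W, b, cross_entropy)
kept_se = select_features(H, labels, W, b, squared_error)
print(kept_ce, kept_se)
```

On this toy data both criteria keep features 0 and 1 and discard feature 2. In a full pipeline, `H` would come from the trained stacked or sparse autoencoder, and `W` and `b` from the fine-tuned softmax layer.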
Keywords/Search Tags:deep learning, autoencoder, feature selection, cross entropy, squared error