Deep learning is a new research direction in machine learning that brings the field closer to its original goal: artificial intelligence. A deep hierarchical network model with multiple hidden layers is constructed to learn the essential characteristics of data samples and represent them at multiple levels; these learned features improve the accuracy of recognition and prediction on unseen data. Handwritten digit feature extraction is an important technique that lets automated equipment recognize handwritten digits, and it has a very wide range of applications in industries such as banking and postal services. In this paper, a deep learning algorithm is applied to feature extraction on a handwritten digit set, a subject of both theoretical and practical value. The main work of this paper is as follows:

1. A feature extraction algorithm for handwritten digits based on the sparse autoencoder model is proposed. Multiple sparse autoencoders are stacked to construct a deep network structure, and digit recognition is carried out by a softmax classifier on the features that the stacked model extracts from the handwritten digit set. Training is divided into two stages: greedy layer-by-layer pre-training, followed by parameter fine-tuning with backpropagation and stochastic gradient descent. The model is simulated on the MNIST data set in MATLAB; it extracts the essential characteristics of the data effectively, and the recognition accuracy exceeds 98.5%. (A sketch of the core building block appears after this list.)

2. An adaptive learning rate method is proposed to address the slow convergence of stochastic gradient descent: the learning rate is adjusted dynamically according to the variance of the error over the last two iterations. (See the second sketch below.)

3. Dropout is applied to the model: the outputs of hidden-layer neurons are reset to zero with 50% probability, and the weights of the dropped neurons are not updated. Simulated on the MNIST data set, the improved model raises recognition accuracy on the test set by more than 0.6% once Dropout is applied, and over-fitting is prevented effectively. (See the third sketch below.)
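
The abstract does not reproduce the implementation, so the following is a minimal sketch of the building block in item 1: one cost-and-gradient evaluation for a single sparse autoencoder with a KL-divergence sparsity penalty, written in Python/NumPy rather than the thesis's MATLAB. The function name, the hyperparameter values (rho, beta, lam), and the layer sizes in the usage lines are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_autoencoder_step(X, W1, b1, W2, b2, rho=0.05, beta=3.0, lam=1e-4):
    """One cost/gradient evaluation for a single sparse autoencoder.
    X is an (n, m) batch with m examples as columns; rho is the target
    mean activation, beta the sparsity weight, lam the weight decay
    (all three values here are illustrative, not from the thesis)."""
    m = X.shape[1]
    # Forward pass: encode the input, then reconstruct it.
    A2 = sigmoid(W1 @ X + b1)                  # hidden features, (h, m)
    A3 = sigmoid(W2 @ A2 + b2)                 # reconstruction, (n, m)
    rho_hat = A2.mean(axis=1, keepdims=True)   # mean activation per hidden unit
    # Cost = reconstruction error + weight decay + KL sparsity penalty.
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    cost = (0.5 / m) * np.sum((A3 - X) ** 2) \
           + 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2)) + beta * kl
    # Backward pass (sigmoid derivative is a * (1 - a)).
    d3 = (A3 - X) * A3 * (1 - A3)
    sparse_term = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
    d2 = (W2.T @ d3 + sparse_term) * A2 * (1 - A2)
    grads = (d2 @ X.T / m + lam * W1, d2.mean(axis=1, keepdims=True),
             d3 @ A2.T / m + lam * W2, d3.mean(axis=1, keepdims=True))
    return cost, grads

# Illustrative sizes: 784 input pixels, 196 hidden units, batch of 256.
rng = np.random.default_rng(0)
n, h, m = 784, 196, 256
W1 = 0.01 * rng.standard_normal((h, n)); b1 = np.zeros((h, 1))
W2 = 0.01 * rng.standard_normal((n, h)); b2 = np.zeros((n, 1))
X = rng.random((n, m))                         # stand-in for a MNIST batch
cost, (gW1, gb1, gW2, gb2) = sparse_autoencoder_step(X, W1, b1, W2, b2)
W1 -= 0.1 * gW1                                # one plain gradient step
```

Greedy pre-training stacks this block: the first autoencoder's hidden activations become the training input for the second, a softmax classifier sits on top, and the whole stack is then fine-tuned end to end with backpropagation and stochastic gradient descent, per the two-stage procedure in item 1.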
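
Item 2 names the signal (the variance of the error over the last two iterations) but not the update rule itself, so the sketch below is one hypothetical instantiation: a small variance is read as stable descent and the rate is grown, a large variance as oscillation and the rate is shrunk. The function name and all constants are assumptions.

```python
import numpy as np

def adapt_learning_rate(lr, err_prev, err_curr, grow=1.05, shrink=0.7,
                        tol=1e-4, lr_min=1e-6, lr_max=1.0):
    """Hypothetical instantiation of the rule in item 2: drive the step
    size from the variance of the last two iteration errors. All of the
    constants (grow, shrink, tol, and the clipping range) are assumed."""
    variance = np.var([err_prev, err_curr])  # = ((err_prev - err_curr) / 2) ** 2
    if variance < tol:
        return min(lr * grow, lr_max)        # errors stable: take bigger steps
    return max(lr * shrink, lr_min)          # errors jumping: take smaller steps

# Inside an SGD loop the rule would be applied once per iteration, e.g.:
#     lr = adapt_learning_rate(lr, err_prev, err_curr)
#     err_prev = err_curr
```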
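
Finally, a minimal sketch of the Dropout step in item 3, assuming standard Dropout: during training each hidden unit's output is reset to zero with probability p = 0.5, and the mask gates the backward pass so the weights of dropped units go un-updated. The function name and the test-time scaling convention are assumptions beyond the abstract.

```python
import numpy as np

def dropout_forward(h, p=0.5, train=True, rng=None):
    """Dropout on hidden-layer activations h, per item 3: during training
    each unit's output is reset to zero with probability p, and the mask
    is returned so the backward pass can leave the incoming weights of
    dropped units un-updated. At test time outputs are scaled by (1 - p),
    the usual correction (an assumption beyond the abstract)."""
    if not train:
        return h * (1.0 - p), None
    rng = rng or np.random.default_rng()
    mask = rng.random(h.shape) >= p            # True = keep the unit
    return h * mask, mask

# The same mask gates the gradient, so dropped units contribute nothing
# and their weights are not updated, e.g. for a sigmoid hidden layer:
#     d_hidden = (W_next.T @ d_next) * mask * A2 * (1 - A2)

rng = np.random.default_rng(0)
A2 = rng.random((196, 32))                     # stand-in hidden activations
A2_drop, mask = dropout_forward(A2, p=0.5, train=True, rng=rng)
```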