
Theory And Application Of Sparse Deep Learning

Posted on: 2020-05-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhao
Full Text: PDF
GTID: 1368330602467986
Subject: Circuits and Systems
Abstract/Summary:
With the continuing success of deep learning across many application fields, its results directly shape the public's evolving understanding of artificial intelligence, of which deep learning is a core element. The theory behind these results, however, remains difficult to study. At present, in both engineering application and theoretical analysis, research related to sparse deep learning is growing rapidly. In particular, as the ways of integrating sparsity into deep networks diversify, effective sparse deep learning models have achieved remarkable practical results, yet many difficulties remain in both research and application.

Research on sparse deep learning involves the following six aspects. First, deep models built with the classic stacking idea, obtained by stacking shallow interpretable models, usually inherit good interpretability, but their differentiability and stability are poor, and on some complex visual tasks their generalization performance still needs improvement. Second, sparse deep learning still optimizes network parameters by stochastic gradient descent with error back-propagation at its core; although practical optimization techniques can alleviate the vanishing-gradient problem, designing an efficient optimization algorithm that avoids local extrema and saddle points remains an open problem. Third, although sparsity aids the compression of deep networks, how to use sparse deep learning to further explore the nature of the over-fitting problem is a difficulty in current research. Fourth, sparsity can be embedded into a deep model in many ways, and although sparse models have many advantages, excessive sparsity often degrades the stability of the model and, in turn, the generalization ability of feedforward neural networks; how to embed sparsity into deep learning so as to preserve the stability of the network model is another difficult point. Fifth, how to analyze the generalization performance and robustness of a network by exploiting the sparse characteristics (such as the attenuation characteristics) of the hidden-layer outputs remains to be solved. Sixth, as networks deepen, information useful for reconstruction is constantly lost or discarded; how to design a sparse deep learning model for decomposition-and-reconstruction tasks is a further current difficulty. In addition, as is well known, the classical deep differentiable system, which relies on gradient descent based on error back-propagation, has achieved great success, a qualitative leap over traditional machine learning models in both generalization performance and model stability; yet it still cannot give a systematic, reasonable answer regarding model interpretability.

In this context, this dissertation carries out systematic theoretical research and analysis on several of the difficult problems above. The main theoretical and applied contributions are as follows:

1. For the optimization of network architecture and model, a fast sparse deep neural network is proposed, providing an alternative training method for learning and optimizing deep neural networks. Its design has two aspects. First, the parameters of each hidden layer are obtained from the closed-form solution of a convex optimization problem, in contrast to the iterative update strategy of the error back-propagation algorithm. Second, the output target is approximated by multi-level linear summation, which differs from existing deep neural networks. In particular, fast sparse deep neural networks can achieve good generalization performance without fine-tuning.

2. A sparse deep combination neural network is proposed for few-shot learning tasks. Its advantage is that a hierarchical optimization mechanism solves an independent convex optimization problem to learn the parameters of each hidden layer. The framework has three parts: first, samples are generated with a combination mechanism based on InfoGAN; second, data learning is used to handle the complexity of the samples; third, the sparse deep combination neural network computes its multi-path layers quickly and efficiently. The design of the network is based on the idea of the extreme learning machine. Experiments show that, under the InfoGAN-based sample combination mechanism, the quality of generated samples improves as the number of combinations increases.

3. Unlike unsupervised layer-by-layer learning, a layer-by-layer supervised method for pre-training is considered, and a sparse deep stacking network framework is proposed. The framework includes a sparse deep stacking extreme learning machine and a sparse deep tensor extreme learning machine. The former is developed in two parts: first, inspired by the extreme learning machine, a sparse single-hidden-layer multi-path extreme learning machine is designed, which needs relatively few hidden nodes and attains high generalization performance at relatively fast speed; second, these machines are stacked to form the sparse deep stacking extreme learning machine. For the sparse deep tensor extreme learning machine, tensor operations effectively reduce the number of hidden-layer parameters, helping the network achieve higher generalization performance.

4. To make full use of category prior information and improve the discriminative ability of the features at each hidden layer of a deep network, a sparse deep discriminative neural network model is proposed, whose purpose is to form a more compact feature representation layer by layer and class by class. Specifically, dictionary learning and a sparse representation classifier are used, respectively, to improve the discriminative ability of each hidden layer of the sparse deep neural network, where discriminative ability is reflected in within-class consistency and between-class difference. Compared with existing deep stacked autoencoder networks and deep belief networks, the proposed network is faster and converges better, and the discriminative hidden layers make the generalization performance of the sparse deep network more competitive on a variety of classification tasks.

5. To design a sparse deep learning model with a decomposition-and-reconstruction mechanism, a sparse deep difference neural network is proposed. In the classical deep learning system, the hierarchically abstracted features are correlated across layers, which makes the network framework difficult to understand and explain as a whole. This work introduces, for the first time, the concept of hierarchical difference features in a sparse deep difference network framework: by optimizing and learning each module, the difference features replace the usual abstract features as an effective expression of the input, making interpretability analysis of the end-to-end network easier. Moreover, this design method extends conveniently to the classical deep learning system, and it differs from the Mallat algorithm of traditional linear decomposition and reconstruction. The concept of hierarchical difference features offers the deep learning system a way of thinking about nonlinear decomposition and reconstruction, and provides another effective expression of the input.
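The closed-form hidden-layer training named in contribution 1 follows the extreme learning machine recipe: random, fixed input weights, then a ridge-regression closed form for the output weights instead of back-propagation. A minimal sketch (NumPy; the function names, network size, and regularization value are illustrative, not taken from the dissertation):

```python
import numpy as np

def elm_train(X, T, n_hidden=64, reg=1e-2, seed=0):
    """ELM-style training of a single-hidden-layer network.

    Hidden-layer weights are random and fixed; only the output weights
    beta are learned, via the closed-form ridge solution
        beta = (H^T H + reg*I)^{-1} H^T T
    rather than iterative error back-propagation.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# usage: fit a smooth toy regression target in one closed-form step
X = np.random.default_rng(0).standard_normal((200, 3))
T = np.sin(X.sum(axis=1, keepdims=True))
W, b, beta = elm_train(X, T)
pred = elm_predict(X, W, b, beta)
```

Because the only learned quantity is a linear solve, training is non-iterative, which is what makes the "fast" and "without fine-tuning" claims plausible for this family of models.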
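The layer-by-layer supervised stacking of contribution 3 can be illustrated with a deep-stacking-network-style loop: each module is an ELM-style layer solved in closed form, and the next module's input is the original features concatenated with the previous module's prediction. This is a sketch under those assumptions, not the dissertation's exact architecture:

```python
import numpy as np

def stack_train(X, T, n_layers=3, n_hidden=32, reg=1e-2, seed=0):
    """Supervised layer-by-layer stacking: every module is trained against
    the target T, and each module sees the raw input augmented with the
    previous module's prediction."""
    rng = np.random.default_rng(seed)
    layers, inp = [], X
    for _ in range(n_layers):
        W = rng.standard_normal((inp.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(inp @ W + b)
        beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
        pred = H @ beta
        layers.append((W, b, beta))
        inp = np.hstack([X, pred])   # stacking step: pass prediction upward
    return layers

def stack_predict(layers, X):
    inp = X
    for W, b, beta in layers:
        pred = np.tanh(inp @ W + b) @ beta
        inp = np.hstack([X, pred])
    return pred

# usage on toy data
X = np.random.default_rng(2).standard_normal((150, 4))
T = np.sin(X[:, :1]) + 0.5 * X[:, 1:2]
layers = stack_train(X, T)
pred = stack_predict(layers, X)
```

Each module is a convex problem solved independently, which matches the abstract's point that the hierarchical mechanism avoids end-to-end gradient descent.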
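The sparse representation classifier used in contribution 4 can be sketched in its standard form: sparse-code a sample over a dictionary whose atoms carry class labels, then assign the class whose atoms alone give the smallest reconstruction residual. The greedy orthogonal-matching-pursuit coder and all names here are illustrative stand-ins, not the dissertation's implementation:

```python
import numpy as np

def omp(D, y, n_nonzero=3):
    """Orthogonal matching pursuit: greedily build a sparse code of y over D."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        cols = sorted(set(support))
        coef, *_ = np.linalg.lstsq(D[:, cols], y, rcond=None)
        residual = y - D[:, cols] @ coef
    x[cols] = coef
    return x

def src_classify(D, labels, y, n_nonzero=3):
    """Sparse representation classifier: pick the class whose own atoms
    reconstruct y with the smallest residual."""
    x = omp(D, y, n_nonzero)
    best, best_res = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        res = np.linalg.norm(y - D[:, mask] @ x[mask])
        if res < best_res:
            best, best_res = c, res
    return best

# usage: two well-separated synthetic classes of unit-norm atoms
rng = np.random.default_rng(1)
blocks = []
for c in range(2):
    base = np.zeros((8, 1)); base[c, 0] = 1.0
    Dc = base + 0.1 * rng.standard_normal((8, 6))
    blocks.append(Dc / np.linalg.norm(Dc, axis=0))
D = np.hstack(blocks)
labels = np.array([0] * 6 + [1] * 6)
y = 0.7 * D[:, 0] + 0.3 * D[:, 1]   # built purely from class-0 atoms
```

The class-wise residual is exactly the "within-class consistency, between-class difference" criterion the abstract describes.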
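One plausible reading of the hierarchical difference features in contribution 5 is a telescoping decomposition: each module produces a reconstruction of its input, and the difference (residual) is handed to the next module, so that summing the per-level reconstructions plus the final residual recovers the input exactly. The moving-average "module" below is only a stand-in for whatever learned module the dissertation uses:

```python
import numpy as np

def smooth(x, k=5):
    """Stand-in module: a moving-average reconstruction of the signal."""
    return np.convolve(x, np.ones(k) / k, mode="same")

def decompose(x, levels=3):
    """At each level, keep the module's reconstruction and pass the
    difference feature (residual) down to the next level."""
    parts, residual = [], x
    for _ in range(levels):
        approx = smooth(residual)
        parts.append(approx)          # this level's reconstruction
        residual = residual - approx  # difference feature for the next level
    return parts, residual

def reconstruct(parts, residual):
    # the telescoping sum restores the input exactly
    return sum(parts) + residual

x = np.sin(np.linspace(0, 6, 128)) + 0.1 * np.random.default_rng(0).standard_normal(128)
parts, residual = decompose(x)
x_hat = reconstruct(parts, residual)
```

The difference features give an alternative, lossless expression of the input, which is the property the abstract contrasts with classical deep networks that discard reconstruction-relevant information layer by layer.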
Keywords/Search Tags: sparse deep learning, extreme learning machine, sparse representation, pattern classification, difference feature