
Parameter Choice For Boltzmann Machines: Theories And Applications

Posted on: 2017-06-07 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: X C Zhao | Full Text: PDF
GTID: 1318330515467069 | Subject: Computer application technology
Abstract/Summary:
Deep learning models have drawn increasing attention due to their impressive empirical performance in various application areas. Despite these practical successes, the fundamental principles behind the design and training of deep architectures remain under debate. In practice, researchers rely on constant trial and error to set the network structure and to control the training process. This lack of theoretical guidance has made design complexity the bottleneck to wider application.

The focus of this thesis is general parameter choice (or reduction) in neural networks: using as few parameters as possible while retaining as much information about the probability distribution as possible, in order to improve the model's computational efficiency and generalization capability. We concentrate on the Boltzmann machine (BM) for two main reasons: 1) the BM is widely used as a basic component of deep learning models; 2) information geometry (IG) provides a unified perspective and analytical tool for BMs. Based on IG, we formalize parameter choice as an optimization problem: maximally preserve the geometric structure of the statistical manifold.

Specifically, the main contributions of this thesis are:

1. We propose a general parameter choice criterion for the family of multivariate binary distributions. Based on IG, we define the relative importance (called the confidence) of a parameter as its contribution to the expected Fisher-Rao information distance within the geometric manifold over the neighbourhood of the underlying true distribution. High-confidence parameters are preserved, while low-confidence parameters are set to a neutral value; this criterion is therefore called the confident-information-first (CIF) principle. We prove that the CIF principle leads to a submanifold that maximally preserves the expected Fisher-Rao information distance between any distribution and its ε-sphere neighborhood (a numerical sketch of the criterion follows this list).

2. We analyze different types of Boltzmann machines as implementations of CIF, which reveals, from a model-selection viewpoint, which essential parts of the target density a BM can capture.

3. We propose an efficient CIF-based model selection algorithm for BMs when a sample is given. Based on CIF, the confidence values determine a priority order over the parameters. Furthermore, a hypothesis test on the confidence decides whether a given parameter should be preserved, which effectively reduces the time complexity of parameter choice.

4. We develop a CIF-based method to regularize the network structure of deep neural networks, so as to alleviate overfitting during training. Parameter confidences measure the importance of connections, and the sub-network consisting of only the confident connections is called ConfNet. We also modify the training algorithm for deep ConfNets, dynamically adjusting the network during training to balance model complexity against sample size (see the training-loop sketch after this list).
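The confidence criterion of contribution 1 can be illustrated numerically. The sketch below is a loose stand-in, not the thesis's algorithm: for a fully visible Boltzmann machine over binary variables, the Fisher information of an exponential family is the covariance of its sufficient statistics, so the sample variance of each statistic serves as a crude diagonal proxy for a parameter's contribution to the Fisher-Rao distance. All names (`parameter_confidences`, `select_parameters`, `keep_ratio`) are illustrative, not from the thesis.

```python
# A minimal sketch of confidence-based parameter choice for a fully
# visible Boltzmann machine p(x) ∝ exp(b·x + x^T W x / 2), x ∈ {0,1}^n.
# The "confidence" here is a simple proxy for the CIF criterion: the
# diagonal Fisher information, which for an exponential family is the
# variance of the corresponding sufficient statistic.
import numpy as np

def parameter_confidences(samples: np.ndarray) -> dict:
    """Return a confidence score for every bias b_i and coupling W_ij.

    samples: (m, n) array of binary observations.
    """
    n = samples.shape[1]
    conf = {}
    # Bias b_i has sufficient statistic x_i; its diagonal Fisher
    # information is Var[x_i], estimated from the sample.
    for i in range(n):
        conf[("b", i)] = samples[:, i].var()
    # Coupling W_ij has sufficient statistic x_i * x_j.
    for i in range(n):
        for j in range(i + 1, n):
            conf[("W", i, j)] = (samples[:, i] * samples[:, j]).var()
    return conf

def select_parameters(conf: dict, keep_ratio: float = 0.5) -> set:
    """Keep the top fraction of parameters by confidence; the rest are
    fixed to a neutral value (0), following the CIF idea of preserving
    the high-confidence coordinates of the statistical manifold."""
    ranked = sorted(conf, key=conf.get, reverse=True)
    k = max(1, int(keep_ratio * len(ranked)))
    return set(ranked[:k])

rng = np.random.default_rng(0)
X = (rng.random((1000, 6)) < 0.3).astype(float)  # toy binary data
kept = select_parameters(parameter_confidences(X), keep_ratio=0.4)
print(f"{len(kept)} parameters retained")
```

In the thesis's algorithm (contribution 3), a hypothesis test on the confidence replaces the fixed `keep_ratio` cutoff used above, which is how the time complexity of the choice is reduced.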
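Contribution 4 describes ConfNet's dynamic adjustment only at a high level, so the following training-loop sketch is an assumption-laden illustration rather than the thesis's method. Since the abstract does not specify how confidences are computed during training, the sketch substitutes a running empirical-Fisher proxy (the mean squared gradient per weight); `train_confnet` and its arguments are hypothetical names.

```python
# A rough sketch of ConfNet-style training: after each epoch the
# network keeps only the "confident" connections and continues
# training, balancing model complexity against sample size.
import numpy as np

def train_confnet(W, grad_fn, data, epochs=10, lr=0.1, keep_ratio=0.7):
    """W: weight matrix; grad_fn(W, batch) -> gradient of the loss.

    A binary mask is recomputed each epoch so that only the most
    confident fraction of weights (highest running squared gradient,
    an empirical-Fisher proxy) stays active.
    """
    mask = np.ones_like(W)
    fisher = np.zeros_like(W)  # running empirical Fisher diagonal
    for _ in range(epochs):
        for batch in data:
            g = grad_fn(W * mask, batch)
            fisher = 0.9 * fisher + 0.1 * g ** 2
            W -= lr * g * mask          # update only active connections
        # Re-select the confident sub-network for the next epoch.
        thresh = np.quantile(fisher, 1.0 - keep_ratio)
        mask = (fisher >= thresh).astype(W.dtype)
    return W * mask

# Toy usage: least-squares on random data (purely illustrative).
rng = np.random.default_rng(1)
X, y = rng.normal(size=(200, 8)), rng.normal(size=(200, 1))
batches = [(X[i:i + 20], y[i:i + 20]) for i in range(0, 200, 20)]
grad = lambda W, b: 2 * b[0].T @ (b[0] @ W - b[1]) / len(b[0])
W_final = train_confnet(rng.normal(size=(8, 1)), grad, batches)
```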
Keywords/Search Tags: Boltzmann Machine, Information Geometry, Parameter Reduction, Model Selection, Deep Neural Networks