
An information theoretic approach to neural network design

Posted on: 1997-09-19    Degree: Ph.D    Type: Dissertation
University: Stanford University    Candidate: Cunha, Fernando B. L.    Full Text: PDF
GTID: 1468390014482253    Subject: Electrical engineering
Abstract/Summary:
The traditional design of neural networks follows a two-step training procedure. In the first step, the initial weight values of a network are chosen at random. In the second step, these weight values are modified so that the network's outputs, in response to a set of input patterns, match a set of target output patterns as closely as possible. Despite much success, the traditional design has exhibited weaknesses stemming from the underlying problem of ill-conditioning, which appears to be intrinsic to neural network training problems.

This dissertation proposes a new network training procedure that adds an intermediate step to the traditional two-step procedure. In this intermediate step, weight values are modified to condition the network to become maximally sensitive to the input patterns used to train it. The step is implemented using a performance measure rooted in information theory and is capable of minimizing ill-conditioning before the final step of training is executed. This step is termed pre-conditioning. Theoretical and experimental analysis of pre-conditioning suggests that the procedure lessens problems with local optima and dramatically reduces network training time.

In addition, this dissertation introduces layered learning, which consists of training each layer of a multilayer neural network individually, and is shown to have a remarkably positive effect on network training time. It also introduces two fundamental pseudo quantities: the pseudodeterminant, the determinant of a rectangular matrix; and pseudoentropy, the amount of disorder on the output surface of an arbitrary mapping. Furthermore, it discusses some analytic properties of neural network layers and provides alternative proofs of generalized versions of the Pythagorean Theorem and the Triangle Inequality. Finally, this dissertation proposes a new non-parametric method of probability density function estimation based on maximum entropy arguments.
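The abstract defines the pseudodeterminant only in passing, as "the determinant of a rectangular matrix." One common construction consistent with that phrase is the product of the nonzero singular values, which reduces to the absolute value of the ordinary determinant for square nonsingular matrices. The sketch below assumes this SVD-based definition; the function name and tolerance are illustrative and not taken from the dissertation:

```python
import numpy as np

def pseudodeterminant(A, tol=1e-12):
    """Product of the nonzero singular values of a (possibly
    rectangular) matrix -- an assumed generalization of the
    determinant, not necessarily the dissertation's exact definition."""
    s = np.linalg.svd(A, compute_uv=False)
    s = s[s > tol]  # discard numerically-zero singular values
    return float(np.prod(s))

# For a square nonsingular matrix this agrees with |det(A)|:
A = np.array([[2.0, 0.0], [0.0, 3.0]])
print(pseudodeterminant(A))  # 6.0

# It is also defined for rectangular matrices, where det(A) is not:
B = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
print(pseudodeterminant(B))  # 2.0
```

Because singular values measure how a mapping stretches its input space, a quantity like this plausibly connects to the abstract's theme of conditioning: a matrix with tiny singular values is ill-conditioned, and the discarded near-zero values flag exactly those directions.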
Keywords/Search Tags: Network, Training, Weight values