Multilayer perceptron structured initialization and separating mean processing

Posted on: 2004-02-29
Degree: Ph.D
Type: Dissertation
University: The University of Texas at Arlington
Candidate: Delashmit, Walter H., Jr.
Full Text: PDF
GTID: 1468390011970162
Subject: Engineering
Abstract/Summary:
Multilayer perceptron neural networks have extensive applications in many areas. Significant advances have been made in the training of these networks, but problems still persist due to the chaotic aspects of training. This dissertation addresses the following problems: (1) network initialization types and their associated training problems; (2) prediction of the monotonic behavior of the mean square error curve as network size increases with the addition of hidden units; (3) improved initialization for defined classes of networks; (4) building larger networks from the results of smaller networks; and (5) development of a separating mean processing architecture.

These problems are addressed with the goal of improving network performance by ensuring that the mean square error curve is a well-behaved, monotonically non-increasing function of network size. Training techniques are developed for (1) randomly initialized, (2) common starting point initialized, and (3) dependently initialized networks. The common starting point initialized networks are enhanced via a structured weight initialization.

Bounds are developed to predict the monotonic behavior of the error curve as a function of the number of hidden units. These bounds are also used to determine the number of random seeds required to ensure a monotonically non-increasing error curve for randomly initialized networks.

Building on the designs of randomly and common starting point initialized networks, dependently initialized networks are designed and evaluated. These networks use the final weights of a well-trained smaller network as the initial weights of a larger network. Results on benchmark data sets show that dependently initialized networks outperform all of the other networks.

Separating mean architectures are also designed, using both a top-down and a bottom-up approach. These architectures separate the linear and nonlinear aspects of training and can produce substantial performance improvements. They can also be implemented using the dependent initialization techniques. For the top-down separating mean architecture, a new technique is developed to identify inputs and outputs with similar means so that those means can be removed.

Performance results for all of these techniques are presented using analytical and simulation results on several benchmark data sets.
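The abstract only sketches the dependent-initialization idea, so the following is a minimal numpy illustration under stated assumptions, not the dissertation's actual algorithm: a small single-hidden-layer MLP is trained, its weights are copied into a larger network, and the added hidden units receive small random input weights and zero output weights so the grown network initially computes the same function (and thus starts at the smaller network's error). All function names are illustrative; the dissertation's structured initialization and bound computations are not reproduced here.

import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hid, n_out, scale=0.1):
    """Randomly initialized single-hidden-layer MLP (illustrative sketch)."""
    return {
        "W1": scale * rng.standard_normal((n_hid, n_in)),
        "b1": np.zeros(n_hid),
        "W2": scale * rng.standard_normal((n_out, n_hid)),
        "b2": np.zeros(n_out),
    }

def forward(net, X):
    H = np.tanh(X @ net["W1"].T + net["b1"])   # hidden-layer activations
    return H @ net["W2"].T + net["b2"], H      # linear output layer

def train(net, X, Y, lr=0.01, epochs=500):
    """Plain batch gradient descent on mean square error."""
    for _ in range(epochs):
        Yhat, H = forward(net, X)
        E = Yhat - Y
        dW2 = E.T @ H / len(X)
        db2 = E.mean(axis=0)
        dH = (E @ net["W2"]) * (1 - H**2)      # backprop through tanh
        net["W2"] -= lr * dW2
        net["b2"] -= lr * db2
        net["W1"] -= lr * (dH.T @ X) / len(X)
        net["b1"] -= lr * dH.mean(axis=0)
    return net

def grow(net, n_extra, scale=0.01):
    """Dependent initialization (assumed form): copy trained weights into a
    larger net; new hidden units get small random input weights and zero
    output weights, so the grown net starts from the parent's error."""
    return {
        "W1": np.vstack([net["W1"],
                         scale * rng.standard_normal((n_extra, net["W1"].shape[1]))]),
        "b1": np.concatenate([net["b1"], np.zeros(n_extra)]),
        "W2": np.hstack([net["W2"], np.zeros((net["W2"].shape[0], n_extra))]),
        "b2": net["b2"].copy(),
    }

# Toy regression problem
X = rng.uniform(-1, 1, (200, 2))
Y = np.sin(3 * X[:, :1]) * X[:, 1:]

small = train(init_mlp(2, 4, 1), X, Y)
mse_small = np.mean((forward(small, X)[0] - Y) ** 2)

big = train(grow(small, n_extra=4), X, Y)
mse_big = np.mean((forward(big, X)[0] - Y) ** 2)

print(f"MSE with 4 hidden units: {mse_small:.5f}")
print(f"MSE grown to 8 units:    {mse_big:.5f}")  # typically no worse than above

The separating mean idea can be hinted at with the same toy setup. This continues from the block above and is only one plausible reading of the abstract: the input and output means are removed before training so the network handles the zero-mean residual, and the means are restored at the output. The dissertation's top-down and bottom-up architectures are more elaborate than this.

# Sketch of mean separation, reusing X, Y, init_mlp, train, forward from above.
x_mean = X.mean(axis=0)
y_mean = Y.mean(axis=0)

net = train(init_mlp(2, 8, 1), X - x_mean, Y - y_mean)
Yhat = forward(net, X - x_mean)[0] + y_mean    # add the separated mean back
print(f"MSE with separated means: {np.mean((Yhat - Y) ** 2):.5f}")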
Keywords/Search Tags: Networks, Separating mean, Common starting point, Training, Performance, Initialization, Error curve, Results