
A Class Of Neural Network Training Methods With Their Applications

Posted on: 2022-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: L M Chen
Full Text: PDF
GTID: 2518306491985659
Subject: Engineering and Computer Technology
Abstract/Summary:
The training of neural networks can be considered a high-dimensional, nonconvex optimization problem that must also generalize. From the optimization perspective, the training algorithm's convergence rate and range of applicability are the focus; from the generalization perspective, the network's performance on the test set is crucial. This dissertation proposes a class of neural network training methods from these two perspectives and applies them to breast cancer diagnosis/prognosis, flatfoot diagnosis, and image classification. The dissertation contains the following three works.

(1) To improve the generalization performance of the Bernstein polynomial neural network (BPoly NN), an ordered ridge regression method is proposed. Among the features extracted by BPoly NN with ordered ridge regression (BPoly NN-O), low-frequency components dominate. Simulations in different dimensions verify the effectiveness of the ordered ridge regression method. In addition, BPoly NN-O is applied to the diagnosis/prognosis of breast cancer; the experimental results show that its generalization performance exceeds that of the original BPoly NN and of the commonly used machine learning models included in the comparison.

(2) To improve the training precision of neural networks, a multiple-pseudo-inverse method is proposed. Unlike neural networks based on pseudoinverse training, such as BPoly NN, the multiple-pseudo-inverse method trains all the parameters of the network iteratively, and only a few iterations are required to complete the training. Simulations verify the effectiveness of the method. On this basis, a multiple-pseudo-inverse neural network (MPNN) is constructed and applied to flatfoot diagnosis, with comparison experiments against other commonly used neural networks. Experimental results show that MPNN performs excellently on this flatfoot diagnosis task.

(3) To improve the generalization performance of deep neural networks, this dissertation proposes a method that deforms the loss surface and thereby induces the optimizer to skip sharp minimum points. By defining deformation mappings in a general sense, optimization performance can be improved from multiple perspectives. A vertical deformation mapping (VDM) is designed to promote generalization, and it is proved theoretically that the VDM filters out sharp minimum points in both low-dimensional and high-dimensional settings. In a two-dimensional simulation, the observed eigenvalues of the Hessian matrix show that the VDM induces the optimizer to skip sharp minimum points; the dissertation also verifies that directly increasing the learning rate cannot replace the proposed method. The VDM is then embedded into convolutional neural networks, and loss-surface visualization experiments show that networks using the VDM converge to flatter regions. In image classification, comparative experiments on CIFAR and ImageNet show that the VDM improves the generalization performance of various convolutional neural networks.
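The ordered ridge regression idea in work (1) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the geometric per-order penalty schedule lam * growth**k, the plain power basis standing in for the Bernstein basis, and all function names are hypothetical, not the dissertation's exact formulation.

```python
import numpy as np

def ordered_ridge(X, y, lam=1e-2, growth=2.0):
    """Ridge regression with an order-dependent penalty: column k of X is
    penalized by lam * growth**k, so low-order (low-frequency) components
    dominate the fitted weights. The geometric schedule is an assumption
    made for this sketch, not the dissertation's exact rule."""
    penalties = lam * growth ** np.arange(X.shape[1])  # grows with order k
    A = X.T @ X + np.diag(penalties)                   # regularized normal matrix
    return np.linalg.solve(A, X.T @ y)

# Fit noisy samples of a smooth target with degree-5 polynomial features
# (a plain power basis stands in for the Bernstein basis here).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
X = np.vander(x, 6, increasing=True)   # columns: x**0 ... x**5
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)
w = ordered_ridge(X, y)
```

Because the penalty grows with the column order, high-order (high-frequency) coefficients are shrunk hardest, which is one simple way to make low-frequency components dominate the extracted features.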
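The multiple-pseudo-inverse idea in work (2) can be sketched as an alternating closed-form solve of both weight matrices in a one-hidden-layer network. This is a hedged illustration only: the hidden-target back-assignment via pinv(W2), the tanh/arctanh pair, and the network shape are assumptions, not the dissertation's exact algorithm.

```python
import numpy as np

def train_mpnn(X, Y, hidden=16, iters=5, seed=0):
    """Alternating pseudoinverse training of a one-hidden-layer network.
    Each sweep solves both weight matrices in closed form with the
    Moore-Penrose pseudoinverse instead of gradient descent, so only a
    few sweeps are needed. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((X.shape[1], hidden)) * 0.5
    for _ in range(iters):
        H = np.tanh(X @ W1)                      # hidden activations
        W2 = np.linalg.pinv(H) @ Y               # solve output layer in closed form
        H_target = Y @ np.linalg.pinv(W2)        # back-assign hidden-layer targets
        H_target = np.clip(H_target, -0.999, 0.999)
        W1 = np.linalg.pinv(X) @ np.arctanh(H_target)  # solve hidden layer
    H = np.tanh(X @ W1)
    W2 = np.linalg.pinv(H) @ Y                   # final output-layer solve
    return W1, W2

# Tiny regression demo: learn y = x1 * x2 on random inputs.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = (X[:, 0] * X[:, 1]).reshape(-1, 1)
W1, W2 = train_mpnn(X, Y)
mse = float(np.mean((np.tanh(X @ W1) @ W2 - Y) ** 2))
```

Each half-step is an ordinary least-squares solve, which is why no learning rate appears and why only a handful of sweeps are needed; how the dissertation assigns hidden-layer targets may well differ from the pinv(W2) back-projection used here.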
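The role of the loss-surface deformation in work (3) can be illustrated on a toy loss. Everything below is hypothetical: the two-well loss, the choice of log as the vertical deformation, and the classical escape criterion lr * curvature > 2 are illustration-only stand-ins, not the dissertation's VDM.

```python
import numpy as np

def loss(x):
    # Toy 1-D loss: a sharp, deep well near x = -1 and a flat, shallower
    # well near x = 2 (an illustrative stand-in, not a real network loss).
    return (1.0
            - 0.99 * np.exp(-(x + 1.0) ** 2 / (2 * 0.1 ** 2))   # sharp well
            - 0.80 * np.exp(-(x - 2.0) ** 2 / (2 * 1.0 ** 2)))  # flat well

def curvature(f, x, h=1e-4):
    # Second derivative by central finite differences.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def deformed(x):
    # A hypothetical vertical deformation: compose the loss with log,
    # which steepens the surface most where the loss value is smallest.
    return np.log(loss(x))

# Gradient descent is locally unstable at a minimum when lr * curvature > 2:
# it overshoots and leaves that basin. Under the deformation, the sharp
# minimum crosses this threshold while the flat one does not.
lr = 0.01
esc_sharp = lr * curvature(deformed, -1.0)  # far above 2: basin is skipped
esc_flat = lr * curvature(deformed, 2.0)    # well below 2: basin is kept
```

In this one-dimensional toy a larger learning rate could achieve a similar effect; the dissertation argues that such an equivalence breaks down in general, which this sketch does not attempt to reproduce.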
Keywords/Search Tags: Generalization, machine learning, sharp minima, neural networks, pseudoinverse matrices