
A Study On Model Optimization Of Neural Networks With Random Weights

Posted on: 2020-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W P Cao
GTID: 1368330599954826
Subject: Parallel information processing

Abstract/Summary:
Neural Networks with Random Weights (NNRW) are a special type of feedforward neural network with a non-iterative learning mechanism. During training, the input weights and hidden biases are generated randomly and remain fixed, and the output weights are calculated by solving a least-squares problem. This learning mechanism gives NNRW the advantages of fast learning and low demands on hardware computing resources. In recent years, NNRW has attracted extensive attention. According to differences in network structure and degree of randomness, the current mainstream research falls into two categories: Random Vector Functional Link networks (RVFL) and Extreme Learning Machines (ELM). Algorithms and applications based on RVFL and ELM have shown great potential in many fields; for example, on some image classification problems the prediction accuracy of NNRW models is higher than that of traditional deep learning models.

However, most of the related algorithms still lack appropriate methods for network initialization, data preprocessing, and the selection of hidden-layer neurons when building models, which results in long modeling times and no guarantee of model performance. Online learning is one of the most suitable applications of NNRW, yet there are still no good solutions to the problems of large-scale data streams and unstable data distributions. It is difficult to balance the updating efficiency and the performance of the online model, which seriously affects its effectiveness and practicality in real applications. In addition, in some real scenarios the number of labeled samples is very limited, which is also a serious challenge for applying NNRW. If these problems are not solved well, the NNRW model lacks any guarantee of reliability and may even cause irreparable damage in practical applications.

The above problems can be divided
into two categories: model optimization problems and application algorithm optimization problems. In this paper, several key problems in the model optimization and application of NNRW are studied.

Specifically, in the model optimization of NNRW:

(1) Research on NNRW initialization has so far been sparse and narrow, and existing results do not consider the relationship between network initialization and the characteristics of the training data. Network initialization is very important for the model optimization of NNRW: if the network is initialized inappropriately, the convergence rate of the model may become very slow and its prediction accuracy very low.

(2) Research on the relationship between the rank of the input matrix and the performance of the NNRW model is scarce and lacks the necessary theoretical support. The rank of the input matrix has a significant impact on NNRW model performance. Although there are some related results based on ELM, they lack the necessary theoretical explanation, and there are few studies based on RVFL.

(3) The number of hidden-layer neurons has a direct impact on model performance, and it is difficult to obtain an optimal network structure by empirical methods. In recent years, researchers have proposed several algorithms for automatically selecting the number of hidden-layer neurons. However, the existing algorithms still have many limitations; for example, they tend to leave many redundant neurons in the network, which harms the generalization ability of the model.

In the application of NNRW:

(4) For the difficult problems in the online learning scenario, such as large-scale data streams, unstable data distributions, and complex data characteristics, NNRW-based solutions are scarce and the flexibility of existing algorithms is low. The large-scale data stream
for model updating is likely to create a conflict between updating efficiency and model performance. Instability of the data distribution in the stream is likely to prevent the model from adapting quickly to the new environment, resulting in a significant decline in its prediction ability. When the data features are very complex, as in image processing problems, existing online learning models with a single hidden layer cannot work well, while deep online learning models are relatively few.

(5) When the amount of labeled training data is very limited, traditional supervised NNRW algorithms cannot work well. Some researchers have proposed semi-supervised learning algorithms that combine fuzzy theory with the ELM algorithm to use both labeled and unlabeled data. However, the relevant research is still in its infancy, the existing algorithms are limited in the type of network they support, and the reasonableness of the algorithm designs lacks the necessary analysis.

In this paper, a series of studies have been carried out on the above five problems:

For problem 1, we study how initializing the RVFL network with different probability distributions affects its model performance, which provides some useful guidelines for researchers initializing RVFL networks. In addition, we study the relationship between the initialization of NNRW (including both ELM and RVFL) and model performance from the perspective of the meta-features of datasets, and reveal the relationship between the intrinsic characteristics of datasets and the initialization of NNRW.

For problem 2, we study the relationship between the rank of the input matrix and the performance of the RVFL model; the results can help researchers do better data preprocessing. We also study how the type of activation function and the number of hidden-layer neurons affect this relationship. In addition, we
propose a new concept, the Dispersion Degree of Matrix Information Distribution (DDMID), to explain the above experimental results theoretically, which fills a theoretical gap in the current related research.

For problem 3, we optimize the bidirectional extreme learning machine (B-ELM), one of the best algorithms for selecting the number of hidden-layer neurons, by using an enhanced random search technique, and propose an enhanced bidirectional extreme learning machine algorithm named EB-ELM. Compared with B-ELM, EB-ELM has a faster convergence rate and better model performance. For example, on the regression dataset Machine CPU, the prediction error of the model trained by EB-ELM is 47.74% lower than that of B-ELM. Furthermore, we propose a random orthogonal projection based enhanced bidirectional extreme learning machine algorithm named OEB-ELM. Compared with EB-ELM, OEB-ELM further reduces the complexity of the network and improves the generalization ability of the model. For example, on the regression dataset Red Wine, OEB-ELM reduces the prediction error of the model by a further 5.92% compared with EB-ELM. Both algorithms can automatically select an appropriate number of hidden-layer neurons for an ELM network, which has important application value.

For problem 4, first, to deal with large-scale data streams in the online learning scenario, we propose using the fuzziness information of instances to filter the data stream and applying only high-quality instances to update the online learning model. In this way, the number of new instances in the data stream is reduced and the updating of the online learning model is sped up. Moreover, high-quality instances help ensure that the model has high prediction performance after updating. For example, the prediction accuracy of the FOS-ELM algorithm proposed in this study is 11.40% higher than that of OS-ELM (Online Sequential ELM) and 17.72% higher than
that of TOS-ELM (Timeliness OS-ELM) on the classification dataset Page.

Second, to handle the unstable statistical characteristics of the instances in the data stream, we design a novel dynamic adjustment strategy for the forgetting factor and propose an ELM-based online sequential learning algorithm with a dynamic forgetting factor (DOS-ELM). DOS-ELM can promptly adjust the relative importance of new and historical data according to changes in model performance, so that the model adapts quickly to the new environment. For example, on the dataset Hyperplane, a popular data stream problem with clear non-stationary properties, the prediction accuracy of the DOS-ELM model is 19.35% higher than that of the OS-ELM model and 11.80% higher than that of the TOS-ELM model. On the regression problem Auto MPG, the prediction error of the DOS-ELM model is 53.10% lower than that of the OS-ELM model, 83.91% lower than that of the TOS-ELM model, and 92.95% lower than that of the WOS-ELM model (a modified OS-ELM). The corresponding testing standard deviation of the DOS-ELM model is 72.31% lower than that of the OS-ELM model, 83.04% lower than that of the TOS-ELM model, and 94.98% lower than that of the WOS-ELM model.

In addition, we extend DOS-ELM to ML-DOS-ELM, a deep online learning model with a multi-hidden-layer network structure. ML-DOS-ELM has better feature extraction ability than the original algorithm while retaining its advantages. On the image classification problem UMIST, ML-DOS-ELM with three hidden layers improves prediction accuracy by 3.59% over DOS-ELM.

For problem 5, we propose a novel semi-supervised learning algorithm named the fuzziness based random vector functional link network algorithm (F-RVFL), which combines the fuzziness of unlabeled data with RVFL. F-RVFL can make full use of limited labeled data and massive unlabeled data, so it can handle the lack of sufficient labeled data in some real scenarios. The reasonableness of the F-RVFL algorithm is also
analyzed in the study. In addition, we use F-RVFL to solve a real liver disease diagnosis problem.

In summary, this paper covers five key issues in the model optimization and application of NNRW. The research on problems 1-3 provides some useful conclusions and methods for the model optimization of NNRW. The research on problems 4 and 5 provides corresponding solutions for the four difficult problems in practical scenarios.

The main innovations of this paper can be summarized as follows:

1. The influence of using different distributions to initialize RVFL on its model performance is studied for the first time. In addition, the relationship between the initialization of NNRW and the performance of its model is studied for the first time from the perspective of the meta-features of a dataset, which reveals the relationship between the intrinsic characteristics of data and network initialization.

2. The relationship between the rank of the input matrix and the performance of the RVFL model is studied for the first time, and the concept of DDMID is proposed to explain it theoretically.

3. Two novel algorithms, EB-ELM and OEB-ELM, are proposed; both can determine the number of hidden-layer nodes automatically.

4. The four proposed algorithms, FOS-ELM, DOS-ELM, ML-DOS-ELM, and F-RVFL, effectively address, respectively, the problems of large-scale updating data streams, unstable data distributions, and complex data features in the online learning scenario, and the limited availability of labeled data in practical application scenarios.
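
As an illustration of the non-iterative learning mechanism described at the start of the abstract (random, fixed input weights and hidden biases, with output weights obtained by least squares), the following is a minimal sketch in Python with NumPy. It is not the dissertation's implementation: the function names, the tanh activation, the uniform [-1, 1] initialization range, and the toy sine-regression data are all assumptions made for illustration. The `direct_link` flag switches between an ELM-style network (hidden features only) and an RVFL-style network (hidden features plus direct input-to-output links).

```python
import numpy as np

def hidden_features(X, W, b, direct_link):
    """Map inputs through the fixed random hidden layer (tanh is an assumed activation)."""
    H = np.tanh(X @ W + b)
    # RVFL-style networks also pass the raw inputs straight to the output layer.
    return np.hstack([X, H]) if direct_link else H

def train_nnrw(X, y, n_hidden=50, direct_link=False, seed=0):
    """Hypothetical helper: fit an NNRW without any iterative weight updates."""
    rng = np.random.default_rng(seed)
    # Input weights and hidden biases are drawn once and never changed.
    # The uniform [-1, 1] range is an assumption, not a recommendation.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    D = hidden_features(X, W, b, direct_link)
    # Output weights: minimum-norm least-squares solution of D @ beta = y.
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    return W, b, beta

def predict_nnrw(X, W, b, beta, direct_link=False):
    """Apply the trained network to new inputs."""
    return hidden_features(X, W, b, direct_link) @ beta

# Toy regression problem (assumed data): learn y = sin(x) on [-3, 3].
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y = np.sin(X).ravel()

W_elm, b_elm, beta_elm = train_nnrw(X, y, direct_link=False)
mse_elm = float(np.mean((predict_nnrw(X, W_elm, b_elm, beta_elm, False) - y) ** 2))

W_rvfl, b_rvfl, beta_rvfl = train_nnrw(X, y, direct_link=True)
mse_rvfl = float(np.mean((predict_nnrw(X, W_rvfl, b_rvfl, beta_rvfl, True) - y) ** 2))

print("ELM-style training MSE:", mse_elm)
print("RVFL-style training MSE:", mse_rvfl)
```

Because only the output weights are fitted, training reduces to a single linear solve. The questions the abstract raises (which distribution to draw `W` and `b` from, how many hidden neurons to use, and how the rank of the input matrix affects the result) correspond to the choices this sketch simply hard-codes.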
Keywords/Search Tags: Neural Networks with Random Weights, Extreme Learning Machine, Random Vector Functional Link Networks, Model Optimization