Research On Extreme Learning Machine Methods For Online Prediction

Posted on: 2020-12-30 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: H Y Yu | Full Text: PDF
GTID: 1368330602955538 | Subject: Communication and Information System
Abstract/Summary:
Online prediction has long been a frontier research task in intelligent information processing. It is valuable in engineering applications such as anomaly diagnosis, system control, signal monitoring, and behavioral reasoning. Machine learning builds nonlinear models in a data-driven way to express relationships in data, which allows new data to be forecast effectively. As a representative machine learning method, the extreme learning machine (ELM) is well suited to big data environments characterized by volume, variety, velocity, and veracity, owing to its analytic solution and fast convergence. In practice, because of real-time sampling requirements and limited computer cache, data arrive one by one or chunk by chunk, forming a fast-changing data stream. Complex and changeable streaming data require the model to adjust its structure automatically according to the prediction results on the current chunk, without manual parameter tuning or model retraining. These requirements challenge the effectiveness and adaptability of prediction algorithms.

The ELM builds multilayer neuronal connections by simulating human learning behavior. It represents the mapping between attributes and labels and realizes intelligent information processing with semantic features. In the training stage, the output weights are calculated with the Moore-Penrose (M-P) generalized inverse, which avoids an exhaustive traversal of parameters. The self-adaptive ELM and the penalized weighted ELM can solve complex function approximation problems effectively and have been applied successfully to offline prediction. However, as data accumulate, the data structure changes significantly and offline prediction methods stop working. To adapt to learning tasks in a changing environment, online prediction models that automatically adjust their parameters and structure still need further study and improvement.

This dissertation considers different label types and attribute structures. The research focuses on long-term cumulative data, time-varying imbalanced labels, self-growing feature spaces, and unstructured image sequences. It aims to improve the robustness and adaptability of machine learning in changing data environments and to explore new methods for large-scale online prediction. The major contributions of this dissertation are as follows:

(1) A dynamic ELM algorithm with balanced variance and bias is proposed. This method targets automatic model adjustment in long-term online prediction. It addresses the difficulty that the traditional ELM prediction model depends on its initial structure and cannot be adjusted quantitatively. The method introduces a parameter that measures the degree of fitting. By decomposing the sequence error and comparing the changing variance and bias, it expresses over-fitting and under-fitting quantitatively. A penalized regression model balances fitting ability against degrees of freedom, and a particle swarm optimization algorithm optimizes the number of hidden-layer nodes and the regularization parameters, forming an automatic update strategy. The prediction model avoids interactive parameter tuning and keeps its structure suitable for long-term prediction. Experimental results demonstrate that the proposed method can adapt to shifting trends. Compared with representative online prediction methods, it achieves lower generalization error and a higher correlation coefficient on four UCI benchmark datasets with different attribute dimensions.

(2) An integrated dynamic ELM method based on two-stage game theory is proposed. This method targets online prediction on imbalanced sequences with multiple sample labels. It addresses the problems of a changing imbalance rate and inaccurate data reconstruction. Through data processing and model updating, the method automatically tracks changes in the sample structure. In the data processing stage, a dynamic ELM with game theory generates minority-class samples and balances the sample distribution across classes. Unlike traditional resampling methods, the proposed method combines a zero-sum game strategy with a principal component analysis threshold to ensure the authenticity of each sample fragment. In the model updating stage, information entropy is used to quantify the overall degree of fitting and to establish the relationship between weight and degree of loss. Meanwhile, an aggregate game-theoretic model is adopted to calculate the combination weights. These strategies help the algorithm build a stable network architecture for the next chunk. The method avoids the poor adaptation caused by converting multi-class classification into multiple binary classification problems and improves the fit to fast-changing data streams. Experimental results demonstrate that the proposed method achieves higher G-mean and F-measure values on six multi-class imbalanced UCI benchmark datasets and improves the prediction ability of the dynamic ELM for minority-class samples.

(3) An integrated dynamic ELM method based on quantile estimation is proposed. This method targets online probabilistic prediction of non-stationary sequences with increasing feature dimensions. It addresses the problems of growing feature dimension and confidence intervals of a single form. For point prediction, the method defines a feature threshold according to the similarity between features and labels, which gives it the advantage of online feature selection. Meanwhile, the method establishes an ensemble learning model and obtains the optimal parameters through an artificial bee colony algorithm, which helps reduce the randomness of the input weights and biases. The filter threshold is adjusted according to the average error, which improves model compactness. For confidence interval prediction, the method uses fuzzy inference and two-dimensional kernel density estimation to determine the confidence interval of the forecast value. This yields a smooth probability density estimate and removes the need to assume a particular error distribution. Typical non-stationary data from photovoltaic energy conversion are selected for the experiments. Experimental results demonstrate that the proposed method achieves high generalization performance and confidence, and matches the periodicity and volatility of the non-stationary sequence.

(4) A multilayer ELM method with object principal trajectories is proposed. This method targets online prediction of unstructured image sequences. It addresses the difficulty of representing image features and their associated semantics in the model. The method takes full account of the spatial-temporal properties of images. The frame difference method and k-means clustering achieve pixel-level extraction of different moving objects. Secondary exponential smoothing calculates the principal trajectory of each moving object, which makes it possible to estimate the movement trends of multiple objects. The multilayer ELM is employed to quantify shape features. The mapping between the historical sequence and the current image produces the new region of interest, which preserves the authenticity of the new image. The FISTA method accelerates the convergence of parameter optimization and simplifies the solving process of the deep neural network. Typical image sequences showing the movement of pedestrians and vehicles are selected for the experiments. Experimental results demonstrate that the proposed method improves prediction accuracy and image resolution. In addition, it mines overall semantic features effectively and does not require building a model for each pixel, thus improving prediction efficiency.
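
As an illustration of the secondary exponential smoothing step named in contribution (4), the following is a minimal sketch, not the dissertation's implementation. It assumes that "secondary exponential smoothing" means Brown's linear (double) exponential smoothing, that each moving object is summarized by a sequence of (x, y) centroids, and that both coordinates are smoothed independently. The function names `brown_forecast` and `forecast_centroid`, the parameter `alpha`, and the initialization with the first observation are all hypothetical choices made for this example.

```python
from typing import Sequence, Tuple


def brown_forecast(series: Sequence[float], alpha: float = 0.5, horizon: int = 1) -> float:
    """Forecast `horizon` steps ahead with Brown's double exponential smoothing.

    Assumption: both smoothed series are initialized with the first observation.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    s1 = s2 = float(series[0])  # first- and second-order smoothed values
    for x in series[1:]:
        s1 = alpha * x + (1.0 - alpha) * s1   # smooth the raw observations
        s2 = alpha * s1 + (1.0 - alpha) * s2  # smooth the smoothed series again

    level = 2.0 * s1 - s2                      # estimate of the current position
    trend = alpha / (1.0 - alpha) * (s1 - s2)  # estimate of the per-step change
    return level + horizon * trend


def forecast_centroid(
    track: Sequence[Tuple[float, float]], alpha: float = 0.5, horizon: int = 1
) -> Tuple[float, float]:
    """Extrapolate an object's (x, y) centroid track, one coordinate at a time."""
    xs = [p[0] for p in track]
    ys = [p[1] for p in track]
    return (brown_forecast(xs, alpha, horizon), brown_forecast(ys, alpha, horizon))


if __name__ == "__main__":
    # Hypothetical centroid track of one object moving right and slightly down.
    track = [(10.0, 50.0), (12.0, 50.5), (14.1, 51.0), (16.0, 51.4), (18.2, 52.0)]
    print(forecast_centroid(track, alpha=0.5, horizon=1))
```

A larger `alpha` makes the extrapolated trajectory react faster to recent motion, while a smaller value gives a smoother trend; how the dissertation selects this parameter is not stated in the abstract.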
Keywords/Search Tags: Online prediction, Extreme learning machine, Imbalanced data, Non-stationary data, Unstructured data