
A Preprocessing And Filling Algorithm For Incomplete High-Dimensional Data

Posted on: 2018-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: H J Sun
Full Text: PDF
GTID: 2428330596454766
Subject: Software engineering
Abstract/Summary:
Data imputation is the process of filling in missing values in a dataset. Existing imputation algorithms suffer from high time complexity, poor accuracy, and low robustness, and the research framework is incomplete. To address these problems, this thesis adopts a complete data imputation pipeline that proceeds from data denoising, through normalization, to dimensionality reduction, and ultimately achieves a better imputation result.

The discrete wavelet transform is used to denoise the high-dimensional datasets. Traditional discrete wavelet denoising usually selects the wavelet basis function by exhaustively traversing the candidates, which is unsuitable for high-dimensional datasets and easily runs into the curse of dimensionality. Taking the characteristics of high-dimensional data into account, a wavelet basis selection method based on random sampling is proposed. Experiments show that the method strikes a balance between computational efficiency and denoising quality.

Traditional normalization methods often require the maximum, minimum, or mean of the dataset, and these statistics must be recalculated whenever new data arrive, causing many redundant computations. For high-dimensional data, this thesis proposes a normalization method based on a normalized exponential function that improves normalization efficiency and, because each value is mapped independently, avoids repeated calculations when new data are added.

A swarm intelligence optimization algorithm is used to reduce the dimensionality of the high-dimensional datasets. Given the high dimensionality and variable characteristics of the data, the bird mating optimizer (BMO) is chosen: because BMO iterates over grouped subpopulations, the proportions of the different groups can be adjusted to the characteristics of the high-dimensional data to achieve a better dimensionality reduction effect. Two improvements to BMO are proposed. First, a parameter adaptation mechanism is introduced so that the algorithm adjusts its parameters in real time according to the characteristics of the dataset and the stage of the iteration. Second, the simulated annealing algorithm is combined with BMO to avoid premature convergence. Together these yield the adaptive simulated annealing BMO algorithm (SABMO). Experiments show that SABMO performs better at reducing the dimensionality of high-dimensional datasets.

Finally, SABMO is used to optimize the weights and thresholds of neural network training, and the SABMO-NN imputation model is proposed. This model is static, however: its weights and thresholds do not change during the application phase, so after a long time the dataset may drift slightly, producing larger prediction errors and requiring the model to be retrained. To address this problem, an improvement based on a feedback correction mechanism is proposed. Experiments show that the improved SABMO-NN imputation model achieves better imputation accuracy.
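To make the pipeline concrete, the sketches below illustrate each stage in Python; they are reconstructions from the abstract, not the thesis's own code. For the denoising stage, one plausible reading of the random-sampling basis selection is to score each candidate wavelet basis on a small random sample of columns rather than on the full high-dimensional dataset. The PyWavelets (pywt) calls are standard; the SNR-after-soft-thresholding score, the candidate list, and the sample size are assumptions.

    import numpy as np
    import pywt

    def select_wavelet_basis(data, candidates=("db2", "db4", "sym4", "coif2"),
                             sample_cols=32, seed=0):
        """Choose a wavelet basis by denoising only a random sample of columns.

        Scoring by reconstruction SNR after universal soft thresholding is an
        assumption; the abstract does not specify the selection criterion.
        """
        rng = np.random.default_rng(seed)
        cols = rng.choice(data.shape[1], size=min(sample_cols, data.shape[1]),
                          replace=False)
        sample = data[:, cols]

        def score(wavelet):
            snr = 0.0
            for col in sample.T:
                coeffs = pywt.wavedec(col, wavelet, level=2)
                # Universal threshold estimated from the finest detail level.
                sigma = np.median(np.abs(coeffs[-1])) / 0.6745
                thr = sigma * np.sqrt(2 * np.log(len(col)))
                denoised = pywt.waverec(
                    [coeffs[0]] + [pywt.threshold(c, thr, mode="soft")
                                   for c in coeffs[1:]],
                    wavelet)[:len(col)]
                noise = col - denoised
                snr += 10 * np.log10(np.sum(denoised ** 2) /
                                     max(np.sum(noise ** 2), 1e-12))
            return snr / sample.shape[1]

        return max(candidates, key=score)

Scoring only a sampled subset of columns is what keeps the basis search tractable as dimensionality grows; the chosen basis is then applied to every dimension.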
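For the normalization stage, the abstract names only a "normalized exponential function". One function with the stated property, namely that each value is mapped independently so newly arriving data never forces recomputing dataset-wide statistics, is the logistic sigmoid; the following is a minimal sketch under that assumption.

    import numpy as np

    def exp_normalize(x, scale=1.0):
        """Map each value independently into (0, 1) with a logistic function.

        Unlike min-max or z-score normalization, no dataset-wide statistics
        (max, min, mean) are needed, so new data can be normalized without
        recomputing anything for the existing data. The logistic form and
        the scale parameter are illustrative assumptions.
        """
        return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float) / scale))

By contrast, min-max normalization must rescale every stored value as soon as a new observation exceeds the old maximum, which is exactly the redundant computation the thesis aims to remove.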
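For SABMO, the abstract specifies two ingredients: parameter adaptation and a simulated annealing escape from premature convergence. The sketch below shows only the simulated annealing component, a Metropolis acceptance rule with a hypothetical geometric cooling schedule; how this interleaves with BMO's grouped mating iterations is not stated in the abstract.

    import numpy as np

    def sa_accept(candidate_cost, current_cost, temperature, rng):
        """Metropolis rule: always accept improvements, and accept a worse
        candidate with probability exp(-delta / T), letting the search
        escape local optima and resist premature convergence."""
        delta = candidate_cost - current_cost
        if delta <= 0:
            return True
        return rng.random() < np.exp(-delta / max(temperature, 1e-12))

    def next_temperature(temperature, alpha=0.95):
        """Hypothetical geometric cooling schedule, applied once per iteration."""
        return alpha * temperature

    # Usage inside an optimizer iteration (rng = np.random.default_rng()):
    #   if sa_accept(cost(candidate), cost(current), T, rng):
    #       current = candidate
    #   T = next_temperature(T)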
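For the feedback correction mechanism, the following is a minimal sketch of the monitoring loop, assuming an RMSE trigger, a fixed threshold, and a retraining callback; none of these specifics appear in the abstract, and the model's predict() interface is likewise assumed.

    import numpy as np

    class FeedbackCorrectedImputer:
        """Wrap a trained imputation model with a feedback correction loop:
        monitor prediction error during the application phase and re-optimize
        the weights and thresholds when the data have drifted."""

        def __init__(self, model, retrain, rmse_threshold=0.1):
            self.model = model      # trained model with an assumed predict()
            self.retrain = retrain  # callback: retrain(model, X, y) -> model
            self.rmse_threshold = rmse_threshold

        def impute_and_monitor(self, X, y_true=None):
            y_pred = self.model.predict(X)
            if y_true is not None:
                rmse = float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
                if rmse > self.rmse_threshold:
                    # Feedback correction: retrain when the monitored error
                    # exceeds the threshold, instead of leaving the static
                    # model in place until a full offline retraining cycle.
                    self.model = self.retrain(self.model, X, y_true)
            return y_pred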
Keywords/Search Tags: High-dimensional data imputation, wavelet threshold denoising, adaptive simulated annealing bird mating optimizer (SABMO), neural network, feedback correction mechanism