| With the explosive growth of data, class imbalance and high dimensionality are becoming increasingly common in large data sets, and both seriously reduce classification accuracy. Conventional classification algorithms perform well on balanced data sets, but their performance degrades sharply when the data are imbalanced, so improving classification performance on imbalanced data is an urgent problem. This paper addresses the classification bias caused by imbalanced data and the loss of accuracy and added difficulty caused by high-dimensional data. The main research contents are as follows. 1. An imbalanced data classification model combined with a variational auto-encoder (VAE) is proposed. Through multiple non-linear feature transformations in the neural network, the VAE learns a sample distribution that is closer to the real data and that takes the characteristics of the minority class into account. The generator of the VAE is then used to generate samples that better match the characteristics of the original data, which balances the training set. The model addresses two limitations of traditional over-sampling: synthetic samples that poorly approximate the real data, and classifier over-fitting. 2. A high-dimensional imbalanced data classification model based on an improved denoising auto-encoder is proposed. In the noise-adding step of the original denoising auto-encoder, the model introduces a new noise function based on the class imbalance, which corrupts the input differently for majority and minority samples. By corrupting the minority samples in the noise layer so that they receive more attention during training, the model addresses the ineffective feature extraction caused by the imbalance between positive and negative samples and reduces the classification error caused by high dimensionality. 3. An imbalanced sentiment classification model is constructed. The model first preprocesses the acquired data through word segmentation, stop-word removal, and text-vector training. The imbalanced text vectors are then processed with VAE over-sampling and the improved denoising auto-encoder. The experimental results show that the two algorithms effectively mitigate the classifier performance degradation caused by imbalanced data. |
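
The class-dependent noise step of the second model can be illustrated with a minimal sketch. The abstract does not specify the noise function, so the example below assumes simple masking noise (features set to zero) with one corruption rate for minority samples and another for majority samples. The function name `class_dependent_masking`, its parameters, and the default rates are hypothetical and chosen only for illustration.

```python
import random


def class_dependent_masking(samples, labels, minority_label,
                            p_minority=0.3, p_majority=0.1, seed=0):
    """Return a corrupted copy of `samples` using class-dependent masking noise.

    Assumption: each feature is zeroed independently with probability
    p_minority for minority-class samples and p_majority for all others.
    This is an illustrative stand-in, not the noise function of the paper.
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    corrupted = []
    for features, label in zip(samples, labels):
        rate = p_minority if label == minority_label else p_majority
        corrupted.append([0.0 if rng.random() < rate else value
                          for value in features])
    return corrupted
```

In a denoising auto-encoder, the corrupted vectors would be fed to the encoder while the reconstruction loss is computed against the clean inputs, so the choice of the two rates controls how strongly each class is perturbed during training.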