In practical research, the data we obtain often contain errors, which may arise from recording mistakes or improper measurement methods. Such data are called noisy data. Because the prior distribution of the noise is unknown in many cases, this paper constructs robust logistic regression models based on a mixture-distribution assumption for the noise and applies them to practical cases. The logistic regression model based on a mixture of Gaussians (MoG-LR) exploits the fact that a mixture of Gaussians can approximate any continuous distribution, which reduces the effect on model performance of a mismatch between the assumed and the true noise distribution. Experimental results show that MoG-LR performs well on data both with and without noisy samples. The logistic regression model based on global adaptive generative adjustment (GAGA-LR) addresses two issues: practical problems do not necessarily satisfy the assumptions of a linear model, and fitting the function with more features can cause overfitting. It adds a penalty term on the regression coefficients of the logistic regression model and uses the global adaptive generative adjustment algorithm for estimation. The MoG-LR model based on global adaptive generative adjustment (GAGA-MoG-LR) combines the mixture-of-Gaussians assumption for the noise with the prior-distribution assumption on the regression coefficients. When the error distribution is unknown, it assumes a mixture distribution for the errors and learns the true, complex distribution adaptively while fitting the regression model, which yields a more suitable robust regression model. At the same time, to counter the overfitting to which nonlinear regression models are prone, a regularization term on the regression coefficients is added to the objective function when solving for the parameters of the mixture distribution model. Experimental results on the data set show that GAGA-MoG-LR performs best among the three models at classifying noisy data.
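As a rough illustration of what a penalty term on the regression coefficients does, the following sketch fits a logistic regression with a fixed ridge (L2) penalty by plain gradient descent on synthetic data with 10% flipped labels. It is not the GAGA algorithm, which the abstract says estimates the model adaptively; the fixed weight `lam`, the function name `fit_penalized_logistic`, the learning rate, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    # Clip the argument to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_penalized_logistic(X, y, lam=1.0, lr=0.1, n_iter=2000):
    """Gradient descent on mean negative log-likelihood + (lam / (2n)) * ||w||^2.

    Hypothetical helper for illustration; `lam` is a fixed, user-chosen
    penalty weight (an assumption, not the adaptive scheme in the paper).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / n + lam * w / n  # likelihood term + penalty term
        grad_b = np.mean(p - y)                   # intercept is left unpenalized
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data (assumed for the example): 5 features, 2 of them irrelevant.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = (sigmoid(X @ true_w) > rng.uniform(size=n)).astype(float)

# Flip 10% of the labels to mimic noisy samples.
flip = rng.choice(n, size=n // 10, replace=False)
y_noisy = y.copy()
y_noisy[flip] = 1.0 - y_noisy[flip]

w_plain, b_plain = fit_penalized_logistic(X, y_noisy, lam=0.0)
w_pen, b_pen = fit_penalized_logistic(X, y_noisy, lam=10.0)

# Training accuracy of the penalized fit on the noisy labels.
acc_pen = np.mean((sigmoid(X @ w_pen + b_pen) > 0.5) == (y_noisy > 0.5))

print("unpenalized coefficient norm:", np.linalg.norm(w_plain))
print("penalized coefficient norm:  ", np.linalg.norm(w_pen))
print("penalized training accuracy: ", acc_pen)
```

The penalized fit shrinks the coefficient vector toward zero relative to the unpenalized fit, which is the mechanism a regularization term relies on to limit overfitting. Choosing the penalty strength adaptively rather than fixing it by hand is the part this sketch leaves out.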