
Study On Bayesian Inference Techniques In Regression Analysis

Posted on: 2016-08-02 | Degree: Master | Type: Thesis
Country: China | Candidate: H L Li | Full Text: PDF
GTID: 2308330464464999 | Subject: Control Science and Engineering
Abstract/Summary:
In recent years, one of the most active directions in regression analysis has been the development of practical Bayesian methods for challenging learning problems. Taking the prediction of typhoon maximum wind speed as its application background, this thesis describes three methods within the Bayesian framework: Gaussian processes, relevance vector machines, and probabilistic principal component analysis. They are used to address three problems: a large number of input variables, correlation among the data samples, and abnormal data. The research covers four parts.

(1) Before a Gaussian process regression model is built, it is noted that the number of input variables is very large and that the output variable is a nonlinear function of the inputs. First, the mutual information between each input variable and the output variable is calculated, because mutual information indirectly reflects the correlation between input and output. A threshold on the mutual information is determined with a t-test, and input variables whose mutual information with the output falls below the threshold are discarded. Once the model inputs have been selected, a Gaussian process regression model is fitted to the selected sample set, and the hyper-parameters of the covariance function are determined under the Bayesian nonparametric framework. The simulation results show that the Gaussian process regression model meets the predetermined requirements on absolute error and has practical value.

(2) Before a regression model is built under the sparse Bayesian framework, it is noted that the sample data differ greatly between regions, so a sparse Bayesian mixture model based on fuzzy c-means clustering is presented. Because fuzzy c-means clustering is sensitive to the choice of initial cluster centers, an improved clustering method based on simulated annealing and a genetic algorithm is proposed. Each sub-model is trained by sparse Bayesian regression on the samples of its corresponding sub-class. The experimental results show that the sparse Bayesian mixture regression model meets the predetermined requirements on absolute error and is superior to the other regression models.

(3) To address the shortcomings of the previous part, namely the dependence on initial cluster centers and the subjective choice of the number of clusters, this part uses the affinity propagation clustering algorithm to cluster the training samples quickly and objectively. Affinity propagation does not require the initial cluster centers or the number of clusters to be set manually. The sub-models are still trained by sparse Bayesian regression, whose sparsity reduces model complexity. The experimental results show that this mixture model predicts better than the model of the previous part.

(4) In the prediction model for typhoon maximum wind speed, the number of input variables is very large, so missing data occur easily, and regression analysis cannot handle this situation. This part proposes a method for predicting missing data based on probabilistic principal component analysis (PPCA), which treats both the abnormal data and the variables to be predicted as missing variables. The experimental results show that this method is more flexible and more accurate than regression analysis.
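
As a rough illustration of the variable screening step in part (1), the following Python sketch scores each input variable by its mutual information with the output and discards those below a threshold. It is not the implementation used in the thesis. The histogram estimator, the bin count, the fixed threshold of 0.2 nats (used here in place of the t-test), the function name `mutual_information`, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats. Estimator is an assumption."""

    def bin_index(values):
        # Equal-width binning; the maximum value is clipped into the last bin.
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = bin_index(x), bin_index(y)
    n = len(x)

    # Joint counts over the occupied (x-bin, y-bin) cells.
    joint = {}
    for i, j in zip(bx, by):
        joint[(i, j)] = joint.get((i, j), 0) + 1

    # Marginal probabilities of each bin.
    px = [bx.count(i) / n for i in range(bins)]
    py = [by.count(j) / n for j in range(bins)]

    return sum(
        (c / n) * math.log((c / n) / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic data (hypothetical): x0 drives the output nonlinearly, x1 is noise.
rng = random.Random(0)
n = 500
x0 = [rng.uniform(-1, 1) for _ in range(n)]
x1 = [rng.uniform(-1, 1) for _ in range(n)]
y = [math.sin(3 * a) + rng.gauss(0, 0.1) for a in x0]

scores = [mutual_information(col, y) for col in (x0, x1)]

# Fixed threshold chosen by hand for this sketch; the thesis derives it with a t-test.
threshold = 0.2
selected = [k for k, s in enumerate(scores) if s >= threshold]
print(scores, selected)
```

The variables that survive the screening would then be passed to the regression model. A plug-in histogram estimate is biased upward for small samples, so any threshold has to be set with the sample size and bin count in mind.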
Keywords/Search Tags: Bayesian method, Regression analysis, Gaussian process, Relevance vector machine, Probabilistic principal component analysis