Application Of Cross-validation Methods In Kernel Partial Least Squares Model

Posted on:2023-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Xu

GTID:2530306836467734

Subject:Applied Statistics

Abstract/Summary:

In many disciplines, such as biomedicine, food science, geoscience, and analytical chemistry, the extensive use of modern instruments such as nuclear magnetic resonance apparatus and high-throughput spectrometers generates massive amounts of data. A large amount of useful information is hidden in these data, and analyzing them quantitatively or qualitatively helps to draw broad and in-depth scientific conclusions. In statistical learning, such analysis, for example classification or regression, is generally called supervised learning. These methods often need to perform two tasks: (a) choose the best combination of learning algorithms and tune their hyperparameters, also known as model selection; and (b) provide a performance estimate for the final reported model. Bootstrap, jackknife, and cross-validation methods are commonly used to select the optimal model (or hyperparameters) and to evaluate model performance. Each of these methods has its own advantages and disadvantages. For example, cross-validation gives an optimistically biased estimate of the true error, that is, it underestimates the true error, when the same cross-validation error is used both to select the model and to report its performance. Many scholars have noticed this problem and proposed different corrections, such as nested cross-validation, which can provide a relatively good estimate but is computationally expensive. To reduce the computation, Tibshirani and Tibshirani proposed the TT method. Because the TT method is simple and intuitive, it has attracted considerable attention, but some scholars believe that it overestimates the true error. Therefore, this thesis proposes two improved methods based on the TT method: one based on one-half of the TT bias (TTBias) and one based on the median. The two improved methods and the original TT method are applied in modeling experiments with partial least squares and kernel partial least squares models on high-dimensional data. The empirical results on synthetic and real data sets show that the two new methods not only correct the optimistic bias of the cross-validation estimate but also avoid the overcorrection of the TT method, indicating that the proposed methods are an improvement.
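
As a rough illustration of the bias correction discussed in the abstract, the sketch below computes the TT estimate from a table of per-fold errors. It assumes equal-sized folds and the usual form of the TT bias: the average, over folds, of the gap between a fold's error at the overall cross-validation minimizer and that fold's own minimum error. The abstract does not define the two improved methods, so `half_tt_estimate` and `median_tt_estimate` are only one plausible reading of their names (half of the TT bias, and the median of the per-fold gaps instead of their mean). They are hypothetical and should not be taken as the thesis's actual definitions. All function names and the toy error table are invented for this example.

```python
from statistics import mean, median


def cv_minimum_and_gaps(fold_errors):
    """fold_errors[k][j] is the error of hyperparameter candidate j on held-out fold k.

    Returns the minimum of the cross-validation curve and the per-fold gaps
    e_k(best overall candidate) - min_j e_k(j). Assumes equal-sized folds,
    so the CV curve is a plain average over folds.
    """
    n_candidates = len(fold_errors[0])
    # Cross-validation curve: mean error of each candidate across folds.
    cv_curve = [mean(fold[j] for fold in fold_errors) for j in range(n_candidates)]
    # Candidate that minimizes the cross-validation curve.
    best = min(range(n_candidates), key=cv_curve.__getitem__)
    # How much worse the overall winner is than each fold's own winner.
    gaps = [fold[best] - min(fold) for fold in fold_errors]
    return cv_curve[best], gaps


def tt_estimate(fold_errors):
    """TT-corrected error: CV minimum plus the mean per-fold gap."""
    cv_min, gaps = cv_minimum_and_gaps(fold_errors)
    return cv_min + mean(gaps)


def half_tt_estimate(fold_errors):
    """ASSUMED reading of 'one-half TTBias': add only half of the TT bias."""
    cv_min, gaps = cv_minimum_and_gaps(fold_errors)
    return cv_min + 0.5 * mean(gaps)


def median_tt_estimate(fold_errors):
    """ASSUMED reading of the 'median' variant: median of the per-fold gaps."""
    cv_min, gaps = cv_minimum_and_gaps(fold_errors)
    return cv_min + median(gaps)


# Toy error table (invented): 3 folds x 3 hyperparameter candidates.
toy_errors = [
    [0.30, 0.20, 0.25],
    [0.17, 0.20, 0.30],
    [0.40, 0.20, 0.08],
]

if __name__ == "__main__":
    cv_min, _ = cv_minimum_and_gaps(toy_errors)
    print("uncorrected CV minimum:", cv_min)
    print("TT estimate:", tt_estimate(toy_errors))
    print("half-bias variant (assumed):", half_tt_estimate(toy_errors))
    print("median variant (assumed):", median_tt_estimate(toy_errors))
```

In practice the error table would come from fitting the partial least squares or kernel partial least squares model once per fold and candidate, so all three estimates reuse a single round of cross-validation, unlike nested cross-validation.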

Keywords/Search Tags:

cross-validation, performance estimation, bias correction, partial least squares, kernel partial least squares

Related items

1	The Theory And Applications Of Partial Least Squares Regression And Sparse Partial Least Squares Regression
2	The Theory And Applications Of Partial Least Squares Regression And Kernel Partial Least Squares Regression
3	Partial Least Squares Regression Model And Applications On Education Statistics
4	Application Research Of Based On Different Penalty Function Constraint Partial Least Squares
5	Treatment Of Multiplex Colinear Problems Based On Kernel Partial Least Square Regression
6	Estimation And Application Of Model Uncertainty In Partial Least Squares
7	Study On The Total Least Squares Method Of Partial EIV Model And Its Application
8	Improvement And Application Of Back Propagation Network Based On Partial Least-Squares Algorithm
9	Analysis Of Factors Influencing China’s Large And Medium Cities Housing Prices Based On Partial Least Squares Regression
10	The Study Of Some Problems In Partial Least Squares Regression