Support Vector Regression Analysis Based On Complex Censored Data

Posted on: 2022-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 2518306482495954
Subject: Statistics
Abstract/Summary:
Machine learning is a branch of artificial intelligence that uses computer programs to simulate human learning behavior in order to improve the computational performance of algorithms. It draws on probability theory, statistical theory and the application of complex algorithms. The support vector machine (SVM), an important machine learning algorithm, was proposed by Vapnik et al. in the 1990s. It performs well on classification and regression problems with large sample sizes and has been widely used in biomedical research, education, industrial production and other fields. The development of the SVM has provided a new way of processing data for the diagnosis of many diseases. Medical data, however, are often censored because of loss to follow-up or patient death, and censored data cannot be fed directly into an SVM. Extending the SVM to the various types of censored data is therefore of practical significance. This thesis first reviews the development and current state of research on SVM models for censored data, the main types of censored data, general support vector regression models, and the cross-validation methods used to assess model performance. The work itself is divided into two parts.

In the first part, weighted least squares support vector regression is carried out for interval-censored data, and an imputation-based weighted least squares support vector regression algorithm is systematically proposed for such data. The interval-censored data are first imputed using the PMDA (Poor Man's Data Augmentation) algorithm and the midpoint substitution method, and the imputed data are then passed to the weighted least squares support vector regression model for right-censored data. Linear and radial basis kernel functions are used to map the data from the low-dimensional input space into a high-dimensional feature space, where the optimal decision plane is sought. Prediction accuracy is assessed by leave-one-out cross-validation. Extensive simulation studies are carried out to verify the effectiveness of the proposed model and algorithm, and both are applied to the analysis of real data, with good results.

In the second part, a least squares support vector regression based on inverse probability weighting for left-truncated and right-censored data (LR-IPWSVM) is proposed. For comparison, the left truncation is ignored in the analysis, which gives the least squares support vector regression based on inverse probability weighting for right-censored data (R-IPWSVM), and the results of LR-IPWSVM and R-IPWSVM are compared. Linear and radial basis kernel functions are again used to find the optimal decision plane in the feature space for both methods, and each model is assessed by leave-one-out cross-validation, which shows that LR-IPWSVM performs better. In the case study, LR-IPWSVM also performs well when applied to the data of the Stanford heart transplant project.
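The first part's pipeline (midpoint substitution, weighted least squares support vector regression with a linear or radial basis kernel, leave-one-out cross-validation) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not say how the observation weights are built, so they are passed in as a strictly positive vector, and the PMDA imputation step is not shown. All function names, the regularization parameter gamma, and the kernel width sigma are assumptions made for illustration. The fit uses the standard weighted LS-SVM linear system in the unknowns b and alpha.

```python
import numpy as np

def linear_kernel(A, B):
    # Linear kernel: plain inner products between rows of A and rows of B.
    return A @ B.T

def rbf_kernel(A, B, sigma=1.0):
    # Radial basis kernel exp(-||a - b||^2 / (2 sigma^2)); sigma is an assumed default.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def midpoint_impute(left, right):
    # Midpoint substitution: replace each censoring interval [left, right] by its centre.
    return (np.asarray(left, dtype=float) + np.asarray(right, dtype=float)) / 2.0

def fit_weighted_lssvr(X, y, weights, kernel, gamma=10.0):
    # Solve the weighted LS-SVM system
    #   [0   1^T                        ] [b    ]   [0]
    #   [1   K + diag(1/(gamma * v_i))  ] [alpha] = [y]
    # where v_i > 0 are the observation weights (taken as given here).
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = kernel(X, X) + np.diag(1.0 / (gamma * np.asarray(weights, dtype=float)))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # intercept b, dual coefficients alpha

def predict(X_train, b, alpha, kernel, X_new):
    # Prediction f(x) = sum_i alpha_i K(x, x_i) + b.
    return kernel(X_new, X_train) @ alpha + b

def loo_mse(X, y, weights, kernel, gamma=10.0):
    # Leave-one-out cross-validation: refit without observation i, predict it,
    # and average the squared prediction errors.
    n = len(y)
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        b, alpha = fit_weighted_lssvr(X[keep], y[keep], weights[keep], kernel, gamma)
        pred = predict(X[keep], b, alpha, kernel, X[i:i + 1])[0]
        errors.append((pred - y[i]) ** 2)
    return float(np.mean(errors))

# Synthetic illustration only: intervals of random width around a smooth signal.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 3.0, size=(30, 1))
centre = np.sin(X_demo[:, 0]) + 2.0
half_width = rng.uniform(0.1, 0.5, size=30)
y_demo = midpoint_impute(centre - half_width, centre + half_width)
w_demo = np.ones(30)  # equal weights as a placeholder
print("LOO MSE, linear kernel:", loo_mse(X_demo, y_demo, w_demo, linear_kernel))
print("LOO MSE, RBF kernel:   ", loo_mse(X_demo, y_demo, w_demo, rbf_kernel))
```

Comparing the two leave-one-out errors mirrors the kernel comparison described in the abstract. In an inverse-probability-weighted variant, observations with zero weight would have to be dropped before fitting, because this system requires every weight to be positive.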
Keywords/Search Tags: Interval-censored data, Left-truncated and right-censored data, Support Vector Regression, Poor Man's Data Augmentation, Multiple Imputation