
Application Of Lp Norm Regularized Regressions In Classification Problems

Posted on: 2019-12-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Z Huang
GTID: 1488305702488324
Subject: Systems Engineering
Abstract/Summary:
Classification is an important research area in machine learning. In real-world applications, the classification problems encountered are complex and changeable. For a specific classification problem, the classifier design must balance generality and specialization. In machine learning, the design of a classifier can be treated as a multivariate fitting (regression) problem. The modeling idea behind Lp norm regularized regressions is simple yet profound, and regularized regressions based on the Lp norm can flexibly handle multivariate fitting problems. Classification and regression are related but also differ. Considering the good properties of Lp norm regularized regressions and the requirements of practical classification tasks, this dissertation studies the application of Lp norm regularized regressions to classification problems. The research work is as follows:

1. Cancer detection based on tumor-educated platelet (TEP) data depends on a classifier. Because cancer diagnosis is a clinical problem, reporting the reliability of a classification result helps in determining follow-up treatment. Therefore, we present a new soft decision model that combines multivariate fitting regression with Bayesian decision theory. The multivariate fitting regression is a variant of L2 norm regularized regression. We use it to map the original multi-dimensional features into a one-dimensional feature. In the mapped feature space, the distribution of each class is fitted by a Gaussian probability density function. We combine the estimated Gaussian densities with Bayesian decision theory to obtain a classifier with posterior probability output. We use support vector machine (SVM) and probabilistic SVM (PSVM) as baselines to demonstrate the soundness of the proposed method on simulated data and real TEP data. The results indicate that the proposed method has higher generalization ability than SVM and PSVM for limited, imbalanced, and noisy data.

2. Because TEP data are high-dimensional, small-sample, and collinear, feature selection is important for improving the generalization ability of the classifier. The elastic net is a regularized regression that uses a convex combination of the L1 and L2 norms as its penalty term, and it can perform feature selection. Given its advantages in handling high-dimensional, small-sample, and collinear data, the elastic net is suitable for feature selection in TEP data. Therefore, we propose using the elastic net to enhance the performance of the classifier. The idea is implemented in a new support vector machine based classifier named CPSVM, a multi-class classification method based on class-specific features. We test the proposed method on simulated data and TEP data. The results show that CPSVM achieves better classification results than traditional classification methods combined with global feature selection methods. CPSVM is suited to classification problems with multiple classes and many features.

3. The classifiers used in this dissertation to detect cancer with TEP data are based on binary classifiers. In machine learning, binary classifiers and one-class classifiers are two basic types of classifier. Based on the characteristics of continuous cognition and multivariate regression, we present a new framework for one-class classifier design, named one-class multiple regression (OC-MR). We also implement OC-MR with the least squares support vector machine (LSSVM), which is a kind of L2 norm regularized regression; the corresponding one-class classifier is named one-class LSSVM (OC-LSSVM). The performance of OC-LSSVM is compared against several classical one-class classifiers on various simulated and real data sets. The experimental results show that OC-LSSVM achieves the best performance on most of the data sets, owing to its robustness to the Gaussian kernel parameter.

In conclusion, this dissertation extends Lp norm regularized regressions to deal with several problems in classification. In this way, basic problems of machine learning, such as binary classification, one-class classification, and feature selection, can be solved uniformly under the framework of Lp norm regularized regressions. Using Lp norm regularized regressions, we solve some classification problems and achieve good results, which also provides new ideas for classifier design.
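
The soft decision model in item 1 (L2 norm regularized regression to a one-dimensional feature, one Gaussian density per class, then a Bayesian posterior) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not specify the regression variant, so plain ridge regression onto targets of -1 and +1 is assumed here, and the function names `fit_soft_decision` and `posterior_positive` and the parameter `lam` are hypothetical.

```python
import numpy as np


def fit_soft_decision(X, y, lam=1.0):
    """Fit a ridge projection and per-class 1-D Gaussians (labels y in {0, 1})."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    t = np.where(y == 1, 1.0, -1.0)                # assumed regression targets
    # Closed-form L2 regularized least squares; the bias term is not penalized.
    R = lam * np.eye(Xb.shape[1])
    R[-1, -1] = 0.0
    w = np.linalg.solve(Xb.T @ Xb + R, Xb.T @ t)
    z = Xb @ w                                     # mapped one-dimensional feature
    params = {}
    for c in (0, 1):
        zc = z[y == c]
        # (mean, standard deviation, class prior) estimated in the mapped space
        params[c] = (zc.mean(), zc.std() + 1e-9, zc.size / z.size)
    return w, params


def posterior_positive(X, w, params):
    """Posterior probability of class 1 from Bayes' rule on the 1-D feature."""
    X = np.asarray(X, dtype=float)
    z = np.hstack([X, np.ones((X.shape[0], 1))]) @ w
    log_joint = {}
    for c, (mu, sigma, prior) in params.items():
        # log(prior * Gaussian density), dropping the constant shared by both classes
        log_joint[c] = np.log(prior) - np.log(sigma) - 0.5 * ((z - mu) / sigma) ** 2
    # p(1 | z) = 1 / (1 + exp(log_joint0 - log_joint1)), clipped to avoid overflow
    d = np.clip(log_joint[0] - log_joint[1], -700.0, 700.0)
    return 1.0 / (1.0 + np.exp(d))
```

Calling `fit_soft_decision(X_train, y_train)` and then `posterior_positive(X_new, w, params)` returns a probability per sample rather than a hard label, which is the kind of reliability output the abstract motivates for clinical use. Working in the log domain is a design choice of this sketch to keep the posterior well defined for samples far from both class means.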
Keywords/Search Tags: Classification, Regression, Lp Norm Regularization, Cancer Detection, Support Vector Machine, Feature Selection, One-class Classifier