
On Two Classic Statistical Inference Problems Of Kernel Machines And Survival Analysis

Posted on: 2021-04-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T T Liu
Full Text: PDF
GTID: 1487306464466384
Subject: Statistics
Abstract/Summary:
Kernel machines and survival analysis are two important topics in statistics. In this thesis, we focus on two problems: kernel machine estimation in the presence of missing responses, and testing the overall difference between two survival curves. For kernel machines with missing responses, one existing method is based on the exponential family. However, its exponential-family assumption can hardly be verified in practice, and the properties of its optimization procedure and its convergence are unclear. We propose two kernel machines to handle the missing-response problem; both can be used for nonparametric regression and for classification. The first, called the weighted-complete-case kernel machine, is applicable to data with missing responses as well as data with missing covariates. It is, however, limited by its assumptions on the missing-data mechanism. The second, called the doubly-robust kernel machine, overcomes this limitation: its empirical risk is unbiased when either the missing-data mechanism or the conditional distribution of the response is correctly specified. Theoretical properties are established, including oracle inequalities for the excess risk, universal consistency, and learning rates. We demonstrate the superiority of the proposed methods over some existing methods by simulation and illustrate their application on a real data set. To promote the use of these two kernel machines, we develop an R package, KM4 ICD.

For the classical two-sample testing problem in survival analysis, there are some well-established methods, such as distance-based methods and distribution-based methods. However, verification of their assumptions is often overlooked, and the tests are less powerful than expected. We propose an intuitive test statistic, namely the area between two survival curves. We emphasize that the proposed method is particularly useful when the two survival curves cross. The asymptotic distributions of the test statistic and of its bootstrap counterpart are derived under the null hypothesis. We show the consistency of the test under general alternatives. Our method has two advantages over existing methods: (i) it allows ties in the data, and (ii) it allows different independent censoring mechanisms in the two groups. We demonstrate the finite-sample superiority of the proposed test over some popular methods in a simulation study and illustrate its application with a real-data example.
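As a rough illustration of the "area between two survival curves" idea, the sketch below computes the integrated absolute difference between two Kaplan-Meier estimates using NumPy. It is not the thesis's implementation: the function names (`kaplan_meier`, `step_value`, `area_between_curves`), the truncation point `tau`, and the use of the absolute difference without any normalization are assumptions made for this example, and the bootstrap calibration described in the abstract is not reproduced here.

```python
import numpy as np


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: distinct event times and S(t) just after each one."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)              # subjects still under observation at t
        deaths = np.sum((times == t) & events)    # tied events at t are counted together
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


def step_value(event_times, surv, t):
    """Evaluate the right-continuous Kaplan-Meier step function at the points t."""
    values = np.concatenate([[1.0], surv])        # S(t) = 1 before the first event
    idx = np.searchsorted(event_times, t, side="right")
    return values[idx]


def area_between_curves(t1, e1, t2, e2, tau=None):
    """Integral of |S1(t) - S2(t)| over [0, tau] (hypothetical form of the statistic)."""
    k1 = kaplan_meier(t1, e1)
    k2 = kaplan_meier(t2, e2)
    if tau is None:
        # Assumption: truncate at the smaller of the two largest observed times.
        tau = min(np.max(t1), np.max(t2))
    grid = np.unique(np.concatenate([[0.0], k1[0], k2[0], [tau]]))
    grid = grid[grid <= tau]
    left = grid[:-1]
    widths = np.diff(grid)
    # Both curves are constant between consecutive grid points, so the sum is exact.
    diff = np.abs(step_value(*k1, left) - step_value(*k2, left))
    return float(np.sum(diff * widths))
```

A call such as `area_between_curves(times_a, events_a, times_b, events_b)` takes nonnegative follow-up times and 0/1 event indicators for each group. Because the absolute difference is integrated, regions where one curve is above the other do not cancel, which is why a statistic of this kind remains sensitive when the curves cross.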
Keywords/Search Tags: kernel machines, missing responses, augmented inverse probability weighted estimator, doubly-robust estimator, universal consistency, learning rate, survival analysis, log-rank test, nonproportional hazard, area between curves, bootstrap test