
Research On Feature Selection Of Ultra-high-dimensional Competitive Risk Data Based On Correlation Rank

Posted on: 2021-03-16
Degree: Master
Type: Thesis
Country: China
Candidate: C G Li
GTID: 2430330605463028
Subject: Statistics
Abstract/Summary:
With the rapid advance of technology, ultrahigh-dimensional data, which can be collected at relatively low cost, have appeared in fields such as genomics, imaging and economics. Analyzing such data is both a focus and a challenge of modern statistical research, mainly because the sample size is much smaller than the number of variables. In recent years, numerous feature screening procedures have been developed for ultrahigh-dimensional standard survival data with only one type of failure event. In practice, however, survival data can be more complex, as with competing risks data and semi-competing risks data. In a study of one type of cancer, for example, patients may die from the cancer or from other causes, and death from other causes is a competing event for death from the cancer. In competing risks analysis it is incorrect to simply treat competing events as censoring. Therefore, methods designed for ultrahigh-dimensional standard survival data cannot be applied directly to ultrahigh-dimensional competing risks data. Nevertheless, the existing literature pays little attention to competing risks data. Some scholars have proposed a sure independence screening method based on the Pearson correlation coefficient for such data. However, that method has shortcomings: it cannot detect nonlinear relationships, and it is sensitive to outliers. This thesis develops a new marginal feature screening procedure for ultrahigh-dimensional time-to-event data that allows for competing risks. The new method has the following advantages: first, it is model-free; second, it is strongly robust against heavy-tailed distributions and potential outliers in the time to the failure type of interest; third, it is invariant to any monotone transformation of the event time of interest. Under rather mild assumptions, the proposed approach is shown to possess the ranking consistency and sure independence screening properties. Numerical studies are conducted to evaluate the finite-sample performance of the new method and to compare it with its competitor. Finally, the new method is illustrated with a real data example.
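The contrast between Pearson-based screening and rank-based screening can be illustrated with a minimal sketch. The abstract does not give the thesis's screening statistic, so the code below is not the proposed method: it assumes fully observed responses (no censoring and no competing events) and swaps in a plain Spearman-type rank correlation as the marginal utility. All function names, the simulated data and the cutoff d = floor(n / log n) are assumptions made for illustration only.

```python
import numpy as np

def rank_vector(v):
    """Ranks 1..n of a 1-D array (assumes continuous data, i.e. no ties)."""
    order = np.argsort(v)
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(1, len(v) + 1)
    return ranks

def pearson_utility(X, y):
    """Absolute Pearson correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))

def rank_utility(X, y):
    """Absolute Spearman-type correlation: Pearson correlation of the ranks."""
    R = np.apply_along_axis(rank_vector, 0, X)
    return pearson_utility(R, rank_vector(y))

def screen(utility, d):
    """Indices of the d covariates with the largest marginal utility."""
    return np.argsort(utility)[::-1][:d]

# Toy data (hypothetical): n = 100 subjects, p = 2000 covariates,
# only covariates 0 and 1 are truly related to the response.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
y = X[:, 0] - X[:, 1] + 0.5 * rng.standard_normal(n)

d = int(n / np.log(n))  # a commonly used screening size; an assumption here
selected = screen(rank_utility(X, y), d)

# The rank-based utility is unchanged by a monotone transformation of the
# response, whereas the Pearson-based utility is not.
rank_shift = np.max(np.abs(rank_utility(X, y) - rank_utility(X, np.exp(y))))
pearson_shift = np.max(np.abs(pearson_utility(X, y) - pearson_utility(X, np.exp(y))))
print("selected set contains covariates 0 and 1:", {0, 1} <= set(selected.tolist()))
print("max change in rank utility under exp():", rank_shift)
print("max change in Pearson utility under exp():", pearson_shift)
```

Working on ranks is what gives this kind of utility its invariance to monotone transformations and its resistance to heavy tails. Handling censoring and competing events, which is the actual subject of the thesis, would require a different statistic than the one sketched here.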
Keywords/Search Tags:Consistency in ranking, Feature screening, Model-free, Sure independence screening, Ultra-high dimensional competing risks data