
The Non-parametric Estimation of False Discovery Rate and Its Application

Posted on: 2015-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: B Li
Full Text: PDF
GTID: 2180330422490728
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Complex data frequently arise in the stock market, gene sequencing, economics, and other fields. Such data are typically characterized by dependence, nonlinearity, high dimensionality, and incomplete observations. The theories, methods, and techniques of data mining were proposed to deal with data collections of this scale. For high-dimensional statistical inference problems, such as testing for significant differences in the expression levels of thousands of genes, estimation of the false discovery rate provides an effective solution.

This thesis investigates testing methods based on the false discovery rate under various parametric and non-parametric models, and is divided into four chapters. First, it defines the false discovery rate in the setting of multiple hypothesis testing, proposes using P-values to carry out the tests, and discusses methods for controlling the false discovery rate when the hypothesis tests are independent or dependent. In studying these control methods and the multiple hypothesis testing problem, the central issue turns out to be how to estimate the number of true null hypotheses, so the thesis uses empirical Bayes estimation to estimate this number.

Estimating the number of true null hypotheses under parametric mixture models and non-parametric models is the core of this dissertation. For the normal mixture model and the Beta mixture model, the thesis uses the method of moments and a least squares method based on the P-values to estimate this number. For the non-parametric mixture model, the thesis introduces the least squares estimation method, the Beta distribution fitting method, and the Bernstein polynomial fitting method.

Finally, the thesis conducts a simulation study based on the gene data of a group of breast cancer patients from Hedenfalk, and finds that the false discovery rate provides a suitable error control target for multiple hypothesis testing of microarray data.
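To make the role of the number of true null hypotheses concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the thesis's own method: the proportion of true nulls is estimated with a simple threshold-based rule (count the P-values above a cutoff `lam`) instead of the empirical Bayes, moment, or least squares estimators described in the abstract, and that estimate is plugged into a Benjamini-Hochberg style step-up procedure. The function names `estimate_pi0` and `bh_reject`, the cutoff `lam=0.5`, and the toy P-values are all hypothetical choices made for this example.

```python
import numpy as np


def estimate_pi0(pvalues, lam=0.5):
    """Threshold-based estimate of the proportion of true null hypotheses.

    Assumption: true-null P-values are uniform on [0, 1], so the fraction of
    P-values above `lam` is roughly pi0 * (1 - lam). This stands in for the
    estimators studied in the thesis; it is not one of them.
    """
    p = np.asarray(pvalues, dtype=float)
    pi0 = np.mean(p > lam) / (1.0 - lam)
    # Keep the estimate in (0, 1]; the floor 1/m avoids dividing by zero later.
    return float(min(1.0, max(pi0, 1.0 / p.size)))


def bh_reject(pvalues, alpha=0.05, pi0=1.0):
    """Step-up procedure targeting FDR level `alpha`.

    With pi0 = 1.0 this is the standard Benjamini-Hochberg procedure; a
    smaller pi0 relaxes the thresholds (the adaptive variant).
    Returns a boolean array marking the rejected hypotheses.
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of P-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / (m * pi0)
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()          # largest rank meeting its threshold
        reject[order[:k + 1]] = True            # reject everything up to that rank
    return reject


# Toy P-values, invented for illustration only.
pvals = [0.5, 0.001, 0.9, 0.02, 0.01]
pi0_hat = estimate_pi0(pvals)
rejected = bh_reject(pvals, alpha=0.05, pi0=pi0_hat)
print(pi0_hat, rejected)
```

The closer the estimated proportion of true nulls is to its real value, the less conservative the procedure becomes while still targeting the same false discovery rate, which is why the abstract treats this estimate as the central problem.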
Keywords/Search Tags: false discovery rate, multiple hypothesis testing, P-value, microarray data, non-parametric estimation