Reproducibility Measurement Based On Adaptive Hybrid Copula And Its Application In High-Throughput Depth Sequencing

Posted on: 2014-11-26; Degree: Master; Type: Thesis
Country: China; Candidate: Q Zhang; Full Text: PDF
GTID: 2208330434972676; Subject: Computer software and theory
Abstract/Summary:
As the computing performance and storage capacity of computers grow, the volume of data generated by experiments increases daily. This raises a serious problem: how can we process such explosive growth in experimental data? Generally speaking, distinguishing real observations from artificial data is a vital first step in any data analysis. Measuring the reproducibility of experimental replicates is an effective method for this purpose.

Here, reproducibility denotes the probability that two observations of a random variable, or two replicates of an experiment, are consistent with each other. Briefly, if the reproducibility of two observations or replicates is high, we can consider them reliable. Conversely, if the reproducibility is low, there is a high probability that the data are artificial. Thus, once the reproducibility of the data has been estimated, we can use it to identify artificial data and thereby improve the reliability of the data.

In this thesis, we first introduce the relevant background of copula theory, give the definition of the mixture copula, and prove several theorems and properties of the mixture copula. Then, based on the mixture copula, we propose a self-adaptive mixture copula method, which adapts to the data automatically and makes no assumption about the data distribution. We use this method to construct a model for measuring the reproducibility of data, called SaMiC. SaMiC has no parameters that need to be tuned and computes reproducibility automatically.

Finally, we test the proposed SaMiC on both simulated and real data and compare it with IDR, another method for measuring reproducibility. The experimental results indicate that SaMiC estimates the reproducibility between replicate samples better than IDR.
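
To make the term "mixture copula" concrete, the following is a minimal sketch of the standard construction: a convex combination of component copula CDFs with non-negative weights that sum to one. The abstract does not say which component families SaMiC uses or how it estimates the weights, so the Clayton, Gumbel, and independence components, the fixed weights, and all function names here are illustrative assumptions, not the thesis's method.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula CDF for u, v in (0, 1] and theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """Gumbel copula CDF for u, v in (0, 1] and theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def independence_cdf(u, v):
    """Independence (product) copula CDF."""
    return u * v


def mixture_cdf(u, v, weights, components):
    """Mixture copula CDF: sum_k w_k * C_k(u, v).

    The weights must be non-negative and sum to one, so that the
    mixture is itself a valid copula.
    """
    if len(weights) != len(components):
        raise ValueError("need one weight per component")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * c(u, v) for w, c in zip(weights, components))


# Hypothetical components and weights, chosen only for illustration.
components = [
    lambda u, v: clayton_cdf(u, v, theta=2.0),
    lambda u, v: gumbel_cdf(u, v, theta=2.0),
    independence_cdf,
]
weights = [0.5, 0.3, 0.2]

value = mixture_cdf(0.4, 0.7, weights, components)
```

Because each component satisfies the copula boundary condition C(1, v) = v, any such convex combination does too, which is one way to check that a mixture is well formed. A self-adaptive method would estimate the weights and component parameters from the data rather than fixing them as above.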
Keywords/Search Tags: Machine Learning, Copula, Reproducibility, Bioinformatics