
Research On Biological Big Data Analysis Algorithm Based On Meta Analysis

Posted on: 2022-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Tong
Full Text: PDF
GTID: 2510306320968329
Subject: Computer technology
Abstract/Summary:
Biomedicine has developed rapidly and broadly since the beginning of this century, and bioinformatics combined with mathematical statistics has made great contributions to human health. The age of biological big data gave rise to the genome-wide association study (GWAS). This approach uses single nucleotide polymorphisms (SNPs) as molecular genetic markers and carries out association analysis and genetic research across the whole human genome, with the ultimate goal of finding the mutant genes related to genetic diseases and to important physiological traits of the organism.

However, because of differences in research methods, sample sizes and other factors, multiple GWAS addressing the same research question often report different results. An effective and comprehensive solution is to use meta-analysis to evaluate these results quantitatively and jointly, so that significant variants can be identified more accurately. As meta-analysis is applied more and more widely in clinical medicine and genetic research, a good deal of work has also been done on the related theory and algorithms. When tens of thousands of genes must be tested simultaneously, however, most of these methods still rely on the conventional theoretical null distribution of the test statistic, which can lead to incorrect significance test results. For this reason, this thesis focuses on analysis algorithms for biological big data based on meta-analysis. The main research results are as follows:

1. On the theoretical side, existing meta-analysis methods are studied further. To address the defect that traditional methods, by relying on the conventional statistical null hypothesis, may produce a large number of false positives in multiple hypothesis testing, the MRSF algorithm based on random sign flipping is proposed. It is compared with four other p-value combination methods on a triglyceride gene data set, and the experimental results show that the MRSF approach is better overall at identifying valid variants.

2. On the application side, the MRSF method is used to find significant variants associated with blood glucose traits. Genetic data from three GWAS on glucose homeostasis were collected, preprocessed, and meta-analyzed with MRSF. In addition to identifying 3 previously reported results, the experiment also identified 6 previously unreported SNPs related to glucose homeostasis, such as rs494874.
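
The abstract does not spell out how MRSF works, so the following Python sketch is only an illustration of the two ideas it names and should not be read as the thesis's algorithm. It shows (a) Fisher's method, one conventional p-value combination that assumes a theoretical chi-square null, and (b) an empirical null built by randomly flipping the signs of study-level z-scores. The function names, the use of a Stouffer-style sum as the combined statistic, the pooling of the null across SNPs, and the toy z-scores are all assumptions made for this example.

```python
import bisect
import math
import random


def fisher_combined_p(p_values):
    """Fisher's method: -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the theoretical null."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    # Closed-form chi-square survival function for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def stouffer_stat(z_scores):
    """Unweighted Stouffer-style combination of per-study z-scores."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def sign_flip_p_values(z_matrix, n_rounds=200, seed=0):
    """Empirical two-sided p-values for each SNP (row of z_matrix).

    Assumption for this sketch: the null distribution is estimated by
    flipping the sign of every study-level z-score at random and pooling
    the resulting combined statistics across all SNPs.
    """
    rng = random.Random(seed)
    observed = [abs(stouffer_stat(row)) for row in z_matrix]
    null = []
    for _ in range(n_rounds):
        for row in z_matrix:
            flipped = [z if rng.random() < 0.5 else -z for z in row]
            null.append(abs(stouffer_stat(flipped)))
    null.sort()
    n = len(null)
    p_values = []
    for obs in observed:
        exceed = n - bisect.bisect_left(null, obs)  # null values >= obs
        p_values.append((1 + exceed) / (1 + n))     # add-one correction
    return p_values


# Hypothetical z-scores: 4 SNPs (rows) across 3 studies (columns).
toy_z = [
    [4.0, 4.0, 4.0],    # consistent signal in all three studies
    [0.3, -0.2, 0.1],
    [1.0, -1.0, 0.5],
    [-0.5, 0.4, 0.2],
]
empirical_p = sign_flip_p_values(toy_z)
fisher_p = fisher_combined_p([0.05, 0.05])
```

In this toy setting the SNP whose effect has the same direction in every study receives the smallest empirical p-value, because random sign flips rarely reproduce a fully concordant pattern. A real analysis would involve far more SNPs and rounds, and the actual MRSF statistic may differ from the one assumed here.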
Keywords/Search Tags: Biological big data, GWAS, SNP, Meta-analysis, p-value