Study Of Differential Expression Genes Detection Algorithm Based On RNA-Seq Data

Posted on: 2018-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhang
GTID: 2310330512977201
Subject: Information and Communication Engineering
Abstract/Summary:
RNA-Seq is a widely used experimental method in modern biology. It is mainly used to profile gene expression, that is, to detect differentially expressed genes. Differential expression analysis studies how the expression of the same gene differs across developmental stages or physiological environments. Its results are not only statistically significant but also biologically meaningful, and they provide an important theoretical basis for understanding the nature of life processes and for studying the regulation of gene expression. This thesis studies the procedure for detecting differentially expressed genes from RNA-Seq data. The main contents are:

(1) A modified normalization algorithm based on the Trimmed Mean of M-Values (TMM) and the geometric mean is presented, using the coefficient of variation and the median absolute deviation. Normalized data are first obtained with the TMM and geometric mean algorithms separately. The optimal coefficient of variation is then calculated for each gene, and the median absolute deviation is used to standardize the data. Experiments show that this algorithm not only removes sequencing error introduced during the experiment but also adjusts all sequenced samples to the same level, giving a smaller error and higher precision than the two existing algorithms.

(2) An improved svaseq algorithm is proposed to remove the batch effect. First, a regular logarithmic transformation model and a logarithmic transformation model are constructed according to the correlation significance parameters. The residual matrix of the data is then obtained from the model. This matrix is factorized, and a subset of the decomposition matrix is used to estimate the surrogate variables. The experimental results show that this algorithm eliminates the batch effect more effectively and also improves the differential expression results.

(3) An improved DESeq algorithm is given for detecting differentially expressed genes. Assuming that the data follow a negative binomial distribution, the mean and variance are first calculated according to the scale factor in order to obtain the dispersion parameters. An exact test is then used for differential expression analysis. The experimental results demonstrate that the improved algorithm detects differentially expressed genes more reliably, with accuracy improved by up to 6.9%.
Keywords/Search Tags: RNA-Seq Data, Normalization Methods, Batch Effects, Differential Expression
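
As a rough illustration of the geometric-mean normalization mentioned in (1) and the mean/variance step in (3), the following Python sketch computes median-of-ratios scale factors and a simple method-of-moments dispersion estimate under a negative binomial model. It is not the thesis's modified algorithm, which the abstract does not specify in detail. The function names `size_factors` and `moment_dispersion`, the genes-by-samples matrix layout, the toy counts, and the variance relation var = mu + alpha * mu^2 are all assumptions made for illustration.

```python
import numpy as np


def size_factors(counts):
    """Median-of-ratios scale factors against a geometric-mean reference.

    counts: genes x samples array of raw read counts (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    # Keep only genes with a positive count in every sample so logs are finite.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Log of each gene's geometric mean across samples (the reference sample).
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample median of the log ratios to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


def moment_dispersion(counts, factors):
    """Per-gene mean, variance and a crude dispersion estimate.

    Assumes var = mu + alpha * mu**2 on the scaled counts; this is a
    simplification, not the estimator used in the thesis.
    """
    normalized = np.asarray(counts, dtype=float) / factors
    mu = normalized.mean(axis=1)
    var = normalized.var(axis=1, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        alpha = (var - mu) / mu**2
    # Clamp negative estimates to zero and guard genes with zero mean.
    alpha = np.where(mu > 0, np.maximum(alpha, 0.0), 0.0)
    return mu, var, alpha


# Hypothetical toy data: 4 genes x 3 samples.
toy_counts = np.array([
    [10, 22, 15],
    [100, 190, 160],
    [5, 12, 7],
    [40, 85, 30],
])
factors = size_factors(toy_counts)
mu, var, alpha = moment_dispersion(toy_counts, factors)
print("scale factors:", factors)
print("dispersion estimates:", alpha)
```

The log-space formulation avoids overflow when multiplying many counts, and restricting the reference to genes expressed in every sample keeps the geometric mean from collapsing to zero.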