Transcript Expression Calculation For RNA-Seq Data

Posted on: 2015-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: M Li
Full Text: PDF
GTID: 2298330422980967
Subject: Computer Science and Technology
Abstract/Summary:
Alternative splicing is widespread in eukaryotes, and research has found that abnormal fluctuations in the process of transcription are often associated with disease. The study of transcription fluctuations has therefore become a hot research topic in recent years, and analyzing the expression levels of genes and isoforms provides a feasible way to reveal changes in alternative splicing.

In recent years, deep-sequencing-based RNA-Seq technology has been widely used in transcriptome research. A single experiment can produce several million reads, from which the expression of a gene’s isoforms can be calculated. The expression levels can then be passed to subsequent analyses such as differential expression and clustering.

RNA-Seq experimental data pose two problems: reads that map to multiple segments of the reference sequence, and the non-uniform distribution of reads. We propose a new method, LDASeq, to calculate transcriptome expression based on LDA (latent Dirichlet allocation). In the LDASeq model we introduce fixed-length “probes” to break up the long reference sequence, and we treat the reads in one channel as a document in order to take full advantage of multi-channel data.

To validate LDASeq, we applied it to three single-end datasets and one paired-end dataset and compared its performance with that of two other major models, RSEM and Cufflinks. The results show that LDASeq obtains more accurate gene and isoform expression estimates than Cufflinks and RSEM.
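
The probe and document ideas mentioned in the abstract can be illustrated with a small Python sketch. This is not the LDASeq implementation: the abstract does not say how reads are assigned to probes, so the rule used here (a read belongs to the probe that contains its start position) is an assumption, as are the function names, the toy sequence, and the channel labels.

```python
from collections import Counter


def make_probes(reference, probe_len):
    """Split a reference sequence into consecutive fixed-length probes.

    The last probe is shorter when len(reference) is not a multiple
    of probe_len.
    """
    return [reference[i:i + probe_len]
            for i in range(0, len(reference), probe_len)]


def probe_index(read_start, probe_len):
    """Assumed rule: a read is assigned to the probe containing its start."""
    return read_start // probe_len


def channel_documents(read_starts_by_channel, probe_len):
    """Turn each channel's read start positions into a bag of probe counts,
    analogous to a bag-of-words document in LDA."""
    return {
        channel: Counter(probe_index(start, probe_len) for start in starts)
        for channel, starts in read_starts_by_channel.items()
    }


# Toy data (hypothetical): a 22-base reference and two channels of reads.
reference = "ACGTACGTACGTACGTACGTAC"
probes = make_probes(reference, probe_len=5)
docs = channel_documents(
    {"channel_1": [0, 3, 7, 21], "channel_2": [5, 6, 12]},
    probe_len=5,
)
```

Each entry of `docs` is a probe-count vector for one channel. A topic model such as LDA could take these vectors as its documents, with isoforms presumably playing the role of topics; that mapping is also an assumption here rather than something the abstract states.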
Keywords/Search Tags: RNA-Seq, isoform expression, gene expression, probabilistic model, LDA