Extraction Pipeline Of Poly(A) Sites On Genome-wide Level And Its Application In Animals

Posted on: 2019-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Zhang
GTID: 2370330545983729
Subject: Systems Engineering
Abstract/Summary:
Polyadenylation is an important step of gene expression in eukaryotes. Polyadenylation sites (poly(A) sites) mark the end of a gene, which helps to identify mature mRNA correctly. A gene may have more than one poly(A) site, and using different poly(A) sites changes the resulting mRNA. Whole-genome 3' end sequencing techniques have produced large numbers of sequences that contain poly(A) information, so the problem for biologists is now how to extract poly(A) sites from the whole genome quickly and efficiently. To analyze WTTS-seq data, we developed a pipeline that extracts poly(A) sites genome-wide; its output is the foundation for further analysis by biologists. We clustered the poly(A) sites according to the distance between them. When a poly(A) site covered several relative positions at the same time, we applied a defined priority to decide which relative position the site was assigned to. The pipeline was implemented as a series of Perl programs together with third-party bioinformatics software, and it can be extended to other sequencing data. We used sliding windows to scan the sequences upstream and downstream of each poly(A) site in order to analyze the distribution of poly(A) signals and to check for an A-rich stretch (ARS). We analyzed 160 million sequences from 16 samples of Xenopus tropicalis, covering different embryonic stages and both sexes. We extracted about 70,000 poly(A) sites from the embryonic stages and about 60,000 poly(A) sites from males and females. The results showed that the number of poly(A) sites decreased from stage 6 and then increased from stage 8, which is in accord with the maternal-to-zygotic transition (MZT). We also found differences in poly(A) site usage between males and females. In a composite analysis of poly(A) site usage in cattle, chicken, rat and Xenopus tropicalis, we found that different gene biotypes favor different classes of poly(A) sites. In summary, the pipeline is widely applicable, runs fast and is highly reliable, so it can help biologists better understand poly(A) site usage across species, developmental stages, sexes and environments, and in turn better understand gene expression and protein diversity.
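
The three steps described in the abstract (clustering sites by distance, resolving overlapping positions by priority, and sliding-window scanning) can be illustrated with a minimal Python sketch. This is not the thesis's Perl implementation. The abstract gives no distance threshold, priority order, window size or signal list, so `MAX_GAP`, `PRIORITY`, `WINDOW`, `MIN_A`, `SIGNALS` and all function names below are hypothetical. Treating "relative positions" as annotated gene regions is also an assumption.

```python
from collections import defaultdict

# All values below are illustrative assumptions, not taken from the thesis.
MAX_GAP = 24                      # max distance (nt) between neighbouring sites in a cluster
PRIORITY = ["3UTR", "CDS", "5UTR", "intron", "intergenic"]  # highest priority first
SIGNALS = ("AATAAA", "ATTAAA")    # two well-known poly(A) signal hexamers
WINDOW = 10                       # window width for the A-rich stretch test
MIN_A = 8                         # minimum number of A's per window to call an ARS


def _summarise(chrom, strand, members):
    # Representative site: most reads; ties go to the smaller position.
    peak = max(members, key=lambda m: (m[1], -m[0]))[0]
    return {"chrom": chrom, "strand": strand,
            "start": members[0][0], "end": members[-1][0],
            "peak": peak, "reads": sum(r for _, r in members)}


def cluster_sites(sites, max_gap=MAX_GAP):
    """Cluster (chrom, strand, pos, reads) records by neighbour distance."""
    by_key = defaultdict(list)
    for chrom, strand, pos, reads in sites:
        by_key[(chrom, strand)].append((pos, reads))
    clusters = []
    for (chrom, strand), items in sorted(by_key.items()):
        items.sort()
        current = [items[0]]
        for pos, reads in items[1:]:
            if pos - current[-1][0] <= max_gap:
                current.append((pos, reads))
            else:
                clusters.append(_summarise(chrom, strand, current))
                current = [(pos, reads)]
        clusters.append(_summarise(chrom, strand, current))
    return clusters


def assign_region(overlaps, priority=PRIORITY):
    """Pick the highest-priority region among those a site overlaps."""
    for region in priority:
        if region in overlaps:
            return region
    return None


def find_signals(upstream, signals=SIGNALS):
    """Slide a 6-nt window over the upstream sequence and report signal hits."""
    upstream = upstream.upper()
    return [(i, upstream[i:i + 6])
            for i in range(len(upstream) - 5)
            if upstream[i:i + 6] in signals]


def has_a_rich_stretch(downstream, window=WINDOW, min_a=MIN_A):
    """True if any window of the downstream sequence has at least min_a A's."""
    downstream = downstream.upper()
    return any(downstream[i:i + window].count("A") >= min_a
               for i in range(len(downstream) - window + 1))
```

For example, `cluster_sites([("chr1", "+", 100, 5), ("chr1", "+", 110, 9), ("chr1", "+", 200, 2)])` groups the first two sites into one cluster and leaves the third on its own. Clustering is done per chromosome and strand so that sites on opposite strands are never merged.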
Keywords/Search Tags: Poly(A) Sites, Extraction, Application