TE Contents, Numbers Of Transcripts And Disease Susceptibility Of The Human Genes

Posted on: 2016-09-22  Degree: Master  Type: Thesis
Country: China  Candidate: D F Cao  Full Text: PDF
GTID: 2310330503494333  Subject: Biology
Abstract/Summary:
Transposable elements (TEs) make up nearly half of the human genome and are found within most genes, although the reported proportion of TEs in protein-coding regions (CDS regions) is only ~1%. This number may be an underestimate. In this study, we systematically analyzed how TE content in CDS regions varies across public genome annotation databases. In the RefSeq database, TEs cover 0.43% of CDS regions. In the Ensembl (GENCODE) database this percentage reaches 1.30%, and the fraction of TEs in Ensembl-specific CDS regions (~15.3%) is much higher than the corresponding fraction in RefSeq (0.43%). The transcription levels of database-specific CDSs are significantly lower than those of CDSs present in both RefSeq and Ensembl. We further investigated the effect of TEs on transcript number and disease susceptibility. Genes with more transcripts have a higher TE content than genes with fewer transcripts. We analyzed the disease-associated genes in the OMIM database and found that they contain more TEs than non-disease-related genes. Recently recognized disease-related genes have more TEs than those identified earlier. Overall, these results indicate that more protein-coding genes with high TE content remain to be found in the future.
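The coverage percentages quoted above are fractions of CDS bases that overlap annotated TEs. The following is a minimal Python sketch of how such a fraction could be computed. It is not the thesis's pipeline: the function names `merge_intervals` and `te_coverage_fraction` are hypothetical, the coordinates are made-up toy values, and it assumes half-open (start, end) intervals on a single chromosome.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def te_coverage_fraction(cds_intervals, te_intervals):
    """Return the fraction of CDS bases overlapped by at least one TE."""
    cds = merge_intervals(cds_intervals)
    tes = merge_intervals(te_intervals)
    total = sum(end - start for start, end in cds)
    if total == 0:
        return 0.0
    covered = 0
    j = 0
    for c_start, c_end in cds:
        # Skip TEs that end before this CDS interval begins.
        while j < len(tes) and tes[j][1] <= c_start:
            j += 1
        # Add the overlap with every TE that starts before this CDS ends.
        k = j
        while k < len(tes) and tes[k][0] < c_end:
            covered += min(c_end, tes[k][1]) - max(c_start, tes[k][0])
            k += 1
    return covered / total


# Toy example (illustrative coordinates only, not data from the thesis).
toy_cds = [(100, 200), (300, 400)]
toy_tes = [(150, 170), (390, 450)]
print(f"TE coverage of CDS: {te_coverage_fraction(toy_cds, toy_tes):.2%}")
```

Merging the intervals first means overlapping annotations are not counted twice, which matters when CDS records from several transcripts of the same gene are pooled.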
Keywords/Search Tags: Transposable Elements, Protein Coding Sequence, Alternative Splicing, Disease Susceptibility, Gene Expression