Identification Of Fusion Genes In Human Tumor Based On RNA-Seq Data

Posted on: 2017-11-03  Degree: Master  Type: Thesis
Country: China  Candidate: Q Chen
GTID: 2334330509962815  Subject: Precision instruments and machinery
Abstract/Summary:
Fusion genes are formed when two separate genes are joined by chromosomal translocation, deletion or inversion. Fusion genes can activate proto-oncogenes, inactivate tumor suppressor genes, or encode oncogenic fusion proteins, all of which are main causes of the occurrence and development of cancer. Numerous studies have confirmed fusion genes in thyroid, breast, lung and other cancers. Identifying fusion genes and characterizing their expressed protein products may provide new opportunities for understanding tumorigenesis and for developing novel diagnostics and targeted therapeutics.

Next-generation sequencing technology generates high-throughput data, which allows us to detect fusion genes at the genomic level. Here, we present an alternative fusion detection algorithm, GFusion, which is designed to capture fusion genes by analyzing the structure by which a fusion gene is formed and the differences between fusion reads and normal reads. First, GFusion uses Bowtie, TopHat and other software to align reads to the reference genome and reports the alignments in a modified SAM format. Artificial paired-end reads are then created from the unmapped reads found in the SAM files. These artificial reads are realigned to the reference genome and located in the human gene annotations to obtain gene names. After multiple filtering steps, candidate fusion reads are identified. Finally, the candidate fusion reads are realigned to a rebuilt Bowtie index to find true fusion genes. Compared with existing software such as TopHat-Fusion and FusionMap, the results reported by GFusion are more reliable because it applies multiple detection standards, in particular the information of mapped ends and the rebuilt Bowtie index.

We have demonstrated the effectiveness of GFusion on three breast cancer cell lines, a normal breast sample and K-562 datasets. In the results, most previously identified fusions were successfully reported.
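The step in which artificial paired-end reads are built from unmapped reads can be sketched as follows. This is a minimal illustration, not GFusion's actual implementation: the abstract does not say how the two ends are cut, so the sketch assumes each unmapped read is split into a fixed-length left end and the reverse complement of a fixed-length right end. The function names and the `end_len` parameter are hypothetical. Only the SAM conventions used here (header lines starting with `@`, FLAG bit `0x4` for an unmapped read, SEQ in the tenth column) come from the SAM format itself.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


class ArtificialPair(NamedTuple):
    name: str
    left: str   # first end_len bases of the unmapped read
    right: str  # reverse complement of the last end_len bases


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read name, sequence) for SAM records whose FLAG has bit 0x4 set."""
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:          # not a complete alignment record
            continue
        if int(fields[1]) & 0x4:      # 0x4 = segment unmapped
            yield fields[0], fields[9]


def make_artificial_pairs(
    reads: Iterable[Tuple[str, str]], end_len: int = 25
) -> Iterator[ArtificialPair]:
    """Split each read into two non-overlapping ends (hypothetical scheme).

    Reads shorter than 2 * end_len are skipped so the ends never overlap.
    """
    for name, seq in reads:
        if len(seq) < 2 * end_len:
            continue
        yield ArtificialPair(name, seq[:end_len], reverse_complement(seq[-end_len:]))
```

The resulting pairs could then be written out as two FASTQ or FASTA files and passed to an aligner in paired-end mode. If the two ends map to different genes, the read becomes a candidate fusion read for the later filtering steps.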
For the three breast cancer cell lines, GFusion found 20 of the 23 previously reported fusions. For the CML cell line K-562, GFusion also reported the BCR-ABL1 fusion gene, which is consistent with research on the pathology of chronic myeloid leukemia. We evaluated the performance of GFusion, TopHat-Fusion and FusionMap on a simulated dataset containing normal background reads and artificial fusion reads. Compared with the published software, GFusion showed higher sensitivity and a lower false discovery rate, an advantage that comes from taking the information of mapped ends into account. These results suggest that GFusion is an optimized solution for the comprehensive detection of fusion genes.
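The two metrics used in the comparison, sensitivity and false discovery rate, can be computed from a list of predicted fusions and a list of known (or simulated) fusions. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and FDR = FP / (TP + FP). It is not the thesis's evaluation script: the function names are hypothetical, and it assumes a fusion is identified by an unordered pair of gene names.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Fusion = FrozenSet[str]


def normalize(pairs: Iterable[Tuple[str, str]]) -> Set[Fusion]:
    """Treat each fusion as an unordered, case-insensitive gene pair (assumption)."""
    return {frozenset((a.upper(), b.upper())) for a, b in pairs}


def evaluate(
    predicted: Iterable[Tuple[str, str]], truth: Iterable[Tuple[str, str]]
) -> Dict[str, float]:
    """Return TP, FP, FN counts with sensitivity and false discovery rate."""
    pred, true = normalize(predicted), normalize(truth)
    tp = len(pred & true)   # reported and real
    fp = len(pred - true)   # reported but not real
    fn = len(true - pred)   # real but missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "sensitivity": sensitivity, "fdr": fdr}
```

Applied to the counts quoted above for the breast cancer cell lines (20 of 23 previously reported fusions recovered), the same ratio gives 20/23, roughly 0.87, relative to that reference list.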
Keywords/Search Tags: fusion genes, human cancer, next-generation sequencing, alignment, detection algorithm