
Detection Method of Cancer-Driven Pathway Based on Second-Generation Sequencing Data

Posted on: 2017-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: H T Li
Full Text: PDF
GTID: 2174330485983946
Subject: Communication and Information System
Abstract/Summary:
High-throughput technologies have enabled researchers to explore a wide variety of biological and biomedical problems at genome-wide scale and have generated huge amounts of biological data. These technologies include microarrays (e.g., gene expression, copy number variation, genome-wide association studies, microRNA, and methylation), next-generation sequencing (e.g., RNA-Seq, whole exome sequencing, and whole genome sequencing), and ChIP-Seq. Analyses of the data generated by these technologies often yield a list of noteworthy genes that are useful for biological interpretation and follow-up validation.

Cancer is often driven by the accumulation of genetic alterations. Recent advances in second-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. Several studies have detected important gene mutations in cancer progression, but they cannot capture the heterogeneity of genome aberrations. It is therefore necessary to shift the point of view from the gene level to the pathway level, which helps capture the heterogeneous mutation patterns in cancer. To meet this rapidly growing demand, appropriate bioinformatics tools must be developed.

In this thesis, we study algorithms for identifying mutated driver pathways from second-generation sequencing data, propose an effective algorithm, elaborate its procedure, and compare it with conventional algorithms. The main work of the thesis is summarized as follows.

First, we introduce a modified method for solving the so-called maximum weight submatrix problem, which is used to identify mutated driver pathways in cancer. The problem is based on two combinatorial properties: coverage and exclusivity. We propose a heuristic optimization algorithm, a hybrid of simulated annealing and a genetic algorithm, named SAGA. In particular, we incorporate gene expression data into SAGA to improve its performance, with satisfactory results.

Second, under the molecular network framework, a genetic aberration may change the network architecture by affecting or removing a node or its connections, or by changing the biochemical properties of a node. We propose the DriverFinder algorithm, which identifies gene expression outliers in tumor samples separately for cancer and non-cancer genes, and filters out long genes whose mutations likely occur by chance by fitting them to a generalized additive model. A relatively large number of experiments show that the algorithm is effective.

Finally, the work of the thesis is briefly summarized, and directions for further research are discussed.
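The abstract does not give the scoring function behind the maximum weight submatrix problem. As a minimal sketch, assume the formulation commonly used for this problem: for a gene set M, the weight is W(M) = 2|Γ(M)| − Σ_{g∈M} |Γ(g)|, where Γ(g) is the set of patients with a mutation in gene g and Γ(M) is the union over M. High coverage raises the weight and overlapping mutations (low exclusivity) lower it. The gene names, patient labels, and function names below are hypothetical, and exhaustive search stands in for the SAGA heuristic, which is only practical to replace this way on toy inputs.

```python
from itertools import combinations


def weight(mutations, gene_set):
    """Assumed weight: 2 * (patients covered) - (total mutation count).

    mutations maps each gene name to the set of patients mutated in it.
    """
    covered = set()
    total = 0
    for gene in gene_set:
        covered |= mutations[gene]
        total += len(mutations[gene])
    return 2 * len(covered) - total


def best_gene_set(mutations, k):
    """Exhaustively search all size-k gene sets (toy-scale only)."""
    candidates = combinations(sorted(mutations), k)
    return max(candidates, key=lambda genes: weight(mutations, genes))


# Hypothetical toy data: G1, G2, G4 are mutually exclusive and cover
# all five patients, while G3 overlaps with G1 and G2.
toy = {
    "G1": {"p1", "p2"},
    "G2": {"p3", "p4"},
    "G3": {"p1", "p2", "p3"},
    "G4": {"p5"},
}

print(best_gene_set(toy, 3))
print(weight(toy, ("G1", "G2", "G4")))
```

On real data the number of candidate gene sets grows combinatorially, which is why heuristic searches such as simulated annealing or genetic algorithms are used instead of enumeration.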
Keywords/Search Tags: Second-generation sequencing technologies, Cancer, Driver pathways, Driver mutations