
Familial Hypercholesterolemia Candidate Gene Identification Using Exome Sequencing and a Study of SNP Filtering

Posted on: 2012-06-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Wan
GTID: 1224330467480021
Subject: Bioinformatics
Abstract/Summary:
Next-generation sequencing technologies have been widely used in disease-related research for their high throughput, low cost, and convenience. However, it is still very costly to sequence a large number of individual genomes, which is much in demand for identifying disease-related genes. Exome-targeted capture technologies, which enrich gene coding regions and reduce the sequencing target from 3 Gb to 30 Mb, have been developed and combined with next-generation sequencing to provide low-cost individual DNA sequencing for research. Exome sequencing was soon broadly applied in studies that detect disease-causing mutations. More than 100 novel disease-related genes have been discovered through this technology since the first paper on it appeared in 2009. The exome sequencing strategy was selected as one of the top 10 scientific breakthroughs by Science magazine.

Familial hypercholesterolemia (FH) is an extremely dangerous human disease. It leads to atherosclerosis and other life-threatening diseases. Its average prevalence in Asian populations is 1/900. Mutations in genes such as LDLR have been identified as causes of FH, yet about 30% of patients do not carry mutations in these genes.

Here we introduce an FH-affected family. Previous studies on this family excluded mutations in the known disease genes. Linkage analysis identified two regions, located on chromosome 3 and chromosome 21. Heterogeneity was also observed in the sample. The proband has more severe symptoms than his parents, which indicates a loss of heterozygosity in his genome.

We sequenced the exomes of the proband and his parents. After read alignment, SNP detection, filtering against dbSNP and 1000 Genomes data, and annotation, we retained non-synonymous SNPs, splicing SNPs, and coding insertions and deletions for the next step of analysis. We tested several possible models of disease-causing mutation. We first checked for de novo mutations and found no related mutation. We then searched for mutations shared by the proband and his mother in the linkage-mapped regions, but found no novel or rare mutations. No disease-related novel homozygous mutation was found in the three samples either. We finally tested for compound heterozygous genes and found novel mutations in 4 genes: ABCA13, EVC2, LOC653203, and STOX1. Of these, the ABCA13 gene, which belongs to the ABC transporter superfamily, is the most likely to be related to FH. Many ABC family genes have been reported to be related to lipid metabolism, and several of them to familial hypercholesterolemia itself. Validation of ABCA13 in a large sample is being undertaken.

We also investigated SNP detection methods. We found a very high false positive rate in the novel variant data (the variants that remain after known variants are filtered out). Further study revealed significant differences between the data patterns of known variants and those of novel variants, which indicated that these differences can be used to guide the elimination of false positives from the novel variant data. We then developed a novel self-adaptive SNP filtering algorithm that uses these data pattern differences to filter out false positives among the novel variants. In this research we also developed a pipeline, including a SNP/indel annotation program, to analyze the data automatically on a computer cluster.
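
The trio screening steps described above (removing known variants, keeping coding-relevant effects, then testing the de novo, homozygous, and compound heterozygous models) can be illustrated with a minimal Python sketch. This is not the dissertation's pipeline: the `Variant` record, the genotype strings, the effect labels, the function names, and the toy gene names are all hypothetical, and the compound heterozygous rule used here (at least one heterozygous variant inherited from each parent within the same gene) is the common textbook definition, assumed rather than taken from the source.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical, simplified record for one annotated variant in a trio."""
    gene: str
    var_id: str    # illustrative identifier only
    effect: str    # assumed labels: "nonsynonymous", "splicing", "coding_indel", "synonymous"
    known: bool    # True if found in a known-variant database (e.g. dbSNP, 1000 Genomes)
    proband: str   # genotype: "0/0", "0/1" or "1/1"
    father: str
    mother: str


# Effects retained for analysis, mirroring the categories named in the abstract.
KEPT_EFFECTS = {"nonsynonymous", "splicing", "coding_indel"}


def candidate_variants(variants):
    """Drop known variants and keep only the coding-relevant effects."""
    return [v for v in variants if not v.known and v.effect in KEPT_EFFECTS]


def de_novo(variants):
    """Variants present in the proband but absent from both parents."""
    return [v for v in variants
            if v.proband != "0/0" and v.father == "0/0" and v.mother == "0/0"]


def homozygous(variants):
    """Variants for which the proband is homozygous for the alternate allele."""
    return [v for v in variants if v.proband == "1/1"]


def compound_het_genes(variants):
    """Genes with at least one paternal-only and one maternal-only het variant."""
    paternal = defaultdict(list)
    maternal = defaultdict(list)
    for v in variants:
        if v.proband != "0/1":
            continue
        if v.father == "0/1" and v.mother == "0/0":
            paternal[v.gene].append(v)
        elif v.mother == "0/1" and v.father == "0/0":
            maternal[v.gene].append(v)
    return {g: paternal[g] + maternal[g] for g in paternal if g in maternal}


# Toy data with made-up gene names; not results from the study.
toy = [
    Variant("GENE_A", "a1", "nonsynonymous", False, "0/1", "0/1", "0/0"),
    Variant("GENE_A", "a2", "splicing", False, "0/1", "0/0", "0/1"),
    Variant("GENE_B", "b1", "nonsynonymous", False, "0/1", "0/1", "0/0"),
    Variant("GENE_C", "c1", "nonsynonymous", True, "0/1", "0/0", "0/0"),
    Variant("GENE_D", "d1", "synonymous", False, "1/1", "0/1", "0/1"),
]

candidates = candidate_variants(toy)
print("de novo:", [v.var_id for v in de_novo(candidates)])
print("homozygous:", [v.var_id for v in homozygous(candidates)])
print("compound het genes:", sorted(compound_het_genes(candidates)))
```

In this toy set only GENE_A passes the compound heterozygous test, because it carries one novel heterozygous variant from each parent. GENE_C and GENE_D are removed earlier, by the known-variant filter and the effect filter respectively. Applying the filters before the inheritance models keeps each model test simple, at the cost of discarding any causal variant that happens to be present in the reference databases.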
Keywords/Search Tags: Exome Sequencing, Familial Hypercholesterolemia, Rare Disease, Mutation Detection, SNP Filter