Development Of Long-read Sequencing-based Precision Screening And Diagnostic Technology For Complex Monogenic Diseases

Posted on: 2024-02-27    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Y D Liu    Full Text: PDF
GTID: 1524307310974039    Subject: Genetics
Abstract/Summary:
Background: Complex monogenic diseases are genetic disorders with complex causative genes or variants and high phenotypic heterogeneity. More than 400 such diseases are estimated to exist, accounting for about 10% of all monogenic diseases with known causative genes. Because of their complex etiology and the limitations of existing testing technologies, many high-incidence, severe genetic diseases pose great challenges for birth defect prevention and control and for the Healthy China strategy. Among them, congenital adrenal hyperplasia (CAH) is a group of autosomal recessive diseases caused by defects in genes encoding enzymes and accessory factors of the adrenal cortical steroid synthesis pathway; severe cases can be fatal because of salt-wasting crisis. Fragile X syndrome (FXS) is the most common monogenic cause of inherited intellectual disability and autism spectrum disorder, with an incidence second only to that of Down syndrome. Although both diseases are among the main targets of the National Birth Defects Prevention and Control Program, the main causative gene of CAH, CYP21A2, lies in a structurally complex genomic region with interference from multiple pseudogenes, while FXS is mainly caused by expansion of the (CGG)n repeat in the non-coding region of exon 1 of the FMR1 gene. Current detection strategies for both diseases require at least two molecular tests that complement or verify each other; the process is cumbersome and time-consuming, yet still falls short of the accuracy and efficiency required for genetic screening and diagnosis. The recent development of long-read sequencing (LRS) offers an unprecedented opportunity to break through this bottleneck in molecular diagnostics, but no LRS-based single-assay technology for precise and comprehensive detection of CAH or FXS has yet been reported in clinical settings.

Objectives: To develop an LRS-based targeted-gene technology system and establish comprehensive analysis methods for CAH and FXS (CACAH and CAFXS, respectively), in order to overcome the limitations of traditional genetic testing in disease screening and diagnosis and to achieve one-step, accurate, comprehensive, and low-cost detection of each of these two complex monogenic diseases. This research provides a powerful molecular diagnostic tool for screening and diagnosis, treatment intervention, and genetic counseling of patients and their family members.

Methods:
I. Research Methods for CAH
1. Sample collection: Clinical data and samples were collected from suspected CAH patients and their family members from 2019 to 2021. The samples were molecularly diagnosed using the current detection methods and then used for retrospective clinical research on CACAH.
2. Current detection method: The main causative gene, CYP21A2, was tested by multiplex ligation-dependent probe amplification (MLPA) for gene deletions, combined with Sanger sequencing for point mutations. Mutations in other CAH-related causative genes were detected using a gene panel for abnormalities of sexual development.
3. CACAH detection method
3.1 Methodology establishment: Targeted long-range PCR (LR-PCR) was optimized through screening of amplification enzymes, primer testing, barcode screening, testing of multiple sample types, and testing of DNA template input, achieving specific and efficient full-length amplification of 5 common CAH-causing genes (CYP21A2, CYP11B1, CYP17A1, HSD3B2, and StAR). A dumbbell-shaped long-amplicon library was then constructed, pooled samples carrying different barcodes were subjected to single-molecule real-time (SMRT) sequencing, and a bioinformatics analysis pipeline was developed using FreeBayes and in-house algorithms, thereby establishing an LRS-based CAH detection and analysis system.
3.2 Performance verification: A single-blind retrospective study was conducted with CACAH on samples previously confirmed by the current detection methods to come from CAH patients or carriers of CAH pathogenic mutations. The accuracy and clinical utility of CACAH were determined by comparison with the current detection methods.
II. Research Methods for FXS
1. Sample collection: Clinical data and samples were collected from patients suspected of having FXS and their family members from 2019 to 2021. The samples were molecularly diagnosed using the current testing methods and then used for retrospective clinical research on CAFXS.
2. Current detection method: FMR1 was tested by Southern blot analysis (SBA) for the (CGG)n repeat region and methylation, combined with triplet repeat-primed PCR (TP-PCR) to determine specific (CGG)n repeat numbers of up to 200.
3. CAFXS detection method
3.1 Methodology establishment: Through screening of additives and primers, optimization of amplification conditions, primer testing, inner barcode screening for the high-GC PCR (the CGG reaction), and testing of the Plus reaction system, comprehensive coverage of FMR1 (CGG)n tandem repeats, AGG insertions, point mutations, and deletion/insertion mutations was achieved. Sequencing data were then obtained through library construction and SMRT sequencing and analyzed with the kernel density estimation (KDE) method to identify the peak values of (CGG)n repeat numbers in CCS reads. Finally, a bioinformatics analysis pipeline was established, completing an LRS-based FXS detection and analysis system.
3.2 Sample validation: Human genomic DNA standards (Coriell DNA samples) were used to verify the sensitivity and accuracy of the method. Normal control samples and Coriell DNA samples with different (CGG)n repeat numbers were mixed for mosaicism analysis. Finally, detection performance was evaluated using clinical family samples with an established molecular diagnosis.

Results:
I. CAH Research Results
1. Sample collection: 37 CAH samples were collected, including 14 dried blood spot samples from newborns and 23 peripheral blood samples. Among them, pathogenic mutations in the CYP17A1 and HSD3B2 genes were detected in 2 samples by the gene panel for abnormalities of sexual development.
2. Current gene testing methods: By combining MLPA and Sanger sequencing, 29 samples were diagnosed as CAH patients and 4 samples were identified as carriers of CYP21A2 mutations. In one family, MLPA detected 3 copies of the CYP21A2 gene in both the proband and the mother. Sequencing showed that they carried 2 and 1 heterozygous point mutations, respectively, but the distribution of the point mutations across the 3 copies could not be determined.
3. CACAH detection
3.1 Methodology establishment: A novel amplification system consisting of 6 primer pairs was designed to comprehensively cover the CAH target sequences. An efficient amplification system was established through extensive screening and testing of primers, enzymes, and reaction conditions, and barcodes were optimized during library construction. SMRT sequencing yielded reliable data for all samples, with ≥200 CCS reads for the CYP21A2 and CYP21A1P genes and ≥30 CCS reads for each of the other 4 genes.
3.2 Clinical utility evaluation: Compared with the combination of MLPA and Sanger sequencing, CACAH accurately detected all CAH genotypes in a single test, with 100% specificity and sensitivity. CACAH accurately detected 30 kb deletions, precisely discriminated genes from pseudogenes in all samples, and identified the subtypes of chimeric genes and the junction sites of deletions/insertions. The method also accurately determined the cis/trans configuration of multiple variants without analyzing family samples. CACAH identified a case carrying 3 copies of CYP21A2 that MLPA plus Sanger sequencing could not confirm. In another case, the improved breakpoint resolution of CACAH allowed the patient's parents to receive more accurate genetic counseling and reproductive guidance.
II. FXS Research Results
1. Sample collection: A total of 62 clinical samples were collected from 21 FXS families, including 57 peripheral blood samples and 5 amniotic fluid samples.
2. Current methods of genetic testing: SBA and TP-PCR together confirmed the molecular diagnosis in 18/21 families, detecting either premutations or full mutations. In the other 3 families, Sanger sequencing detected a point mutation in one proband, and stepwise PCR amplification detected large deletions in the other two probands. TP-PCR identified premutation alleles with 161-197 CGG repeats in 5 samples that were not recognized by SBA; for the remaining samples, the two methods gave consistent results.
3. CAFXS testing
3.1 Methodology establishment: The CGG reaction was designed to cover the (CGG)n tandem repeat region, and the Plus reaction covered other variant types, such as point mutations and deletions of the FMR1 gene itself. The amplification system was established through extensive screening and optimization of PCR additives, primers, and reaction conditions. After SMRT sequencing, genotype-specific quality control standards were established. Samples with only full-mutation alleles had ≥100 CCS reads, whereas samples with other mutation types or mosaicism had ≥2000 CCS reads. For SNVs, indels, or large gene deletions in FMR1, the number of CCS reads containing the variant was ≥30. The data produced by the CAFXS method were reliable.
3.2 Clinical utility assessment: Among the gDNA standards, Coriell DNA with up to 940 CGG repeats was accurately detected, and premutation/full-mutation mosaicism as low as 0.5% was detected in artificially titrated samples. The analytical sensitivity was 2-4 times higher than that of TP-PCR. The method also accurately revealed the AGG insertion pattern of all samples, including heterozygotes and mosaics. In clinical samples, the CGG repeat numbers and the number and positions of AGG insertions were accurately determined for all 133 alleles (including mosaic alleles). In addition, two gene variants affecting TP-PCR primer binding and two samples carrying large deletions of 237.1 kb and 774.0 kb, respectively, each encompassing the entire FMR1 gene, were accurately detected.

Conclusion:
1. The new LRS-based CACAH detection technology overcomes the limitations of existing methods, namely incomplete coverage of mutation types, difficulty in distinguishing genes from pseudogenes, and difficulty in determining the cis/trans relationship of mutations. It achieves comprehensive and accurate genotyping of the five major causative genes of CAH, accurately distinguishing genes from pseudogenes and identifying chimeric gene subtypes. It provides powerful technical support for timely and accurate molecular diagnosis of CAH.
2. The new LRS-based CAFXS detection technology breaks through technical barriers such as incomplete coverage of mutation types and the difficulty of precisely testing the (CGG)n tandem repeat region. It can detect all types of FXS mutations, including (CGG)n expansions and their AGG insertions, which account for 99% of FXS cases. In addition, it can detect rare gene variants and large gene deletions in FMR1, and can detect low-level premutation/full-mutation mosaicism down to 0.5%. It provides a powerful molecular detection tool for FXS carrier screening and molecular diagnosis.
3. With CAH and FXS as representative examples, the LRS-based comprehensive analysis technology for targeted complex monogenic diseases fundamentally improves disease screening and diagnosis. It is comprehensive, accurate, high-throughput, one-step, and cost-effective, and shows broad prospects for clinical application. It is an important exploration toward building a new LRS technical system for birth defect prevention and control in China.
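
The KDE step mentioned in the FXS methods (finding peak (CGG)n repeat numbers among CCS reads) could be sketched roughly as below. This is an illustrative sketch only, not the dissertation's pipeline: the function names `count_cgg_repeats` and `kde_peaks`, the Gaussian kernel, the fixed bandwidth of 1 repeat unit, and the 5% relative-height cut-off are all assumptions made for the example, and the repeat counter simply takes the longest run of CGG/AGG triplets in a read.

```python
import math
import re

# Hypothetical helper: a run of CGG triplets, allowing AGG interruptions.
_REPEAT_RUN = re.compile(r"(?:[CA]GG)+")


def count_cgg_repeats(read: str) -> int:
    """Return the length, in triplets, of the longest CGG/AGG run in a read."""
    return max((len(m.group(0)) // 3 for m in _REPEAT_RUN.finditer(read)), default=0)


def kde_peaks(counts, bandwidth=1.0, min_rel_height=0.05):
    """Locate peaks of a Gaussian KDE over per-read repeat counts.

    bandwidth and min_rel_height are assumed values, not taken from the source.
    Returns the repeat numbers (integers) at which the density has a local
    maximum at least min_rel_height times as high as the tallest peak.
    """
    grid = range(min(counts) - 1, max(counts) + 2)  # pad so edge values can be peaks
    norm = len(counts) * bandwidth * math.sqrt(2.0 * math.pi)
    density = [
        sum(math.exp(-0.5 * ((g - c) / bandwidth) ** 2) for c in counts) / norm
        for g in grid
    ]
    top = max(density)
    peaks = []
    for i in range(1, len(density) - 1):
        is_local_max = density[i] > density[i - 1] and density[i] >= density[i + 1]
        if is_local_max and density[i] >= min_rel_height * top:
            peaks.append(grid[i])
    return peaks


if __name__ == "__main__":
    # Simulated per-read repeat counts for two alleles with some PCR/sequencing scatter.
    simulated = [30] * 50 + [29] * 5 + [31] * 5 + [100] * 20 + [99] * 3 + [101] * 3
    print(kde_peaks(simulated))
```

In this toy setting each density peak is read as one allele (or one mosaic sub-population). A real pipeline would additionally need read-quality filtering, a data-driven bandwidth, and a read-count threshold per peak.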
Keywords/Search Tags: Complex monogenic disease, Congenital adrenal hyperplasia (CAH), Fragile X syndrome (FXS), Long-range PCR (LR-PCR), Long-read sequencing (LRS)