| Solid tumors have become one of the main diseases affecting the life span and quality of life of Chinese residents, as their morbidity and mortality continue to climb. In recent years, precision medicine, based on the characteristics of the cancer genome revealed by high-throughput sequencing (HTS), has made it possible to customize a treatment plan for each solid tumor patient, showing a clear prospect of benefiting more patients. Among the HTS strategies used in clinical practice, the large panel (covering ~1 Mb of the genome) is more cost-effective and feasible than other strategies such as whole genome sequencing (WGS), whole exome sequencing (WES) and hotspot detection. It concentrates on genomic regions where pathogenic mutations occur while retaining the ability to obtain enough genomic information for advanced assays. Compared with traditional molecular methods, the frequently used tumor-normal large panel method requires a much more complex process. First, sufficient qualified tumor tissue and para-cancerous tissue must be provided. Then the paired samples go through multiple error-prone steps in the "wet bench" and "dry bench" processes. This leads to inaccurate results in clinical testing and poor consistency among laboratories, impeding the application and development of precision medicine in clinical practice. Meanwhile, the somatic mutation detection performance of large panel assays in domestic sequencing laboratories has not yet been investigated or evaluated. To evaluate and improve their quality, proper tumor reference samples with a clearly established variant dataset are a prerequisite for the corresponding external quality assessment (EQA). However, the existing reference samples and their variant datasets fail to meet the demands of a large panel EQA program. Clinical biopsy smear samples are unsuitable for a large-scale EQA program because their quality is sensitive to embedding treatment and preservation time, and uncontrolled tumor purity and tumor heterogeneity would lead to 
poor inter-laboratory consistency. Spike-in and gene editing methods, which introduce variants one by one, are both uneconomical and not interchangeable with patient samples. The germline method, which uses easily measurable germline variants from several tumor cell lines to simulate somatic mutations, can provide enough variants, but it is unrelated to the mechanism of tumorigenesis and development in clinical tumors and has questionable applicability to clinical somatic mutation research. Available commercial cell lines either have no paired normal cell line or have a limited number of mutations and a limited allele frequency (AF) range. More importantly, because most laboratories develop their methodologies based on these cell lines, the samples are already familiar to them, so a blinded EQA program launched with these cell lines would be biased. In this study, we developed a series of tumor-normal paired reference samples, using effective construction and feasible validation methods, that are based on easily available normal cell lines and resemble real-world clinical tumors. To introduce abundant somatic variants, we knocked down the key MLH1 and MLH2 genes in the mismatch repair pathway, as well as the proofreading-associated POLE gene, by CRISPR-Cas9 technology. This simulates the deficient mismatch repair (dMMR) state of clinical tumor tissues. After several months of passage and culture, somatic variants were generated in the edited sub-clones. To establish a credible variant dataset for the reference samples and to evaluate the large-panel detection ability of sequencing laboratories, we applied an integrated method that determines somatic variants and assesses detection results simultaneously. WES results, together with reliable oncopanel results selected for high sequencing depth, perfect reproducibility and superior concordance with WES results, were used to determine the positive somatic variants in our reference samples. Finally, 14 pairs of edited tumor-normal reference samples with a variant dataset containing 168 somatic variants and 
covering a 1.8 Mb genomic region were generated. For variants with an allele frequency ≥5%, the results reported by 56 participating laboratories contained 1306 errors, including 729 false negatives (FNs), 179 false positives (FPs) and 398 reproducibility errors. Performance metrics varied among panels: precision ranged from 0.773 to 1 and recall from 0.683 to 1. Incorrect filtering accounted for a large proportion of the false results (both FNs and FPs), while low-quality detection, cross-contamination and other sequencing errors during the wet bench process were also sources of error. In addition, low AF (5%) considerably affected reproducibility and comparability among panels. In summary, we developed dMMR tumor-normal paired cell lines using CRISPR-Cas9 technology. Compared with existing reference materials, they contain abundant, changeable and measurable somatic variants that are consistent with the mechanism of tumorigenesis in clinical tumors. Through the subsequent EQA, we precisely established the variant dataset of the reference samples and evaluated the somatic variant detection ability of the laboratories' large panels. The common errors in the detection results were analyzed and traced. Our research should help sequencing laboratories develop, validate and optimize their large panels. |
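
The abstract reports precision and recall per panel but does not spell out how they are computed. The following is a minimal sketch, assuming the standard variant-level definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The `PanelCounts` class and the example counts are hypothetical and are not taken from the study; only the three error totals at the end come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelCounts:
    """Variant-level tallies for one panel against a reference variant dataset."""
    tp: int  # reference variants the panel reported
    fp: int  # reported variants absent from the reference dataset
    fn: int  # reference variants the panel missed


def precision(c: PanelCounts) -> float:
    """Fraction of reported variants that are true: TP / (TP + FP)."""
    called = c.tp + c.fp
    return c.tp / called if called else 0.0


def recall(c: PanelCounts) -> float:
    """Fraction of reference variants that were reported: TP / (TP + FN)."""
    expected = c.tp + c.fn
    return c.tp / expected if expected else 0.0


# Hypothetical panel: the counts are invented for illustration only.
example = PanelCounts(tp=150, fp=10, fn=18)
example_precision = precision(example)
example_recall = recall(example)

# Error categories as reported in the abstract (AF >= 5%, 56 laboratories).
reported_errors = {"false_negative": 729, "false_positive": 179, "reproducibility": 398}
total_errors = sum(reported_errors.values())

if __name__ == "__main__":
    print(f"precision={example_precision:.3f} recall={example_recall:.3f}")
    print(f"total reported errors: {total_errors}")
```

Guarding against a zero denominator lets a panel with no calls (or a region with no reference variants) be scored without raising an exception. Returning 0.0 in that case is a convention chosen here, not something stated in the source.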