Lung cancer is the most prevalent cancer and the most common cause of cancer-related death in China. Early diagnosis of lung cancer greatly improves prognosis and reduces the socioeconomic burden. However, early differentiation of the nature of pulmonary nodules has always been important but challenging. Conventionally, the diagnosis is confirmed by combining medical imaging with biopsy followed by pathological evaluation, the latter being the gold standard. The shortcomings of biopsy-dependent pathological diagnosis are also obvious: it is invasive, nonrepeatable, and limited by the anatomic complexity of the nodules. With the development of medical imaging, low-dose computed tomography (LDCT) and high-resolution CT (HRCT) scans have become increasingly informative and helpful. Yet the mediocre sensitivity and specificity of current imaging techniques, which result from the many-to-many mapping between the nature of a nodule and its image pattern, are still unable to meet clinical demands. A relatively novel idea is to measure trace tumor markers in the bloodstream, such as the epigenetic modification level of circulating tumor DNA (ctDNA); however, its application is limited by high cost and a rather low positive predictive value (PPV). In recent years, the emergence of artificial intelligence (AI) has strengthened the robustness of pulmonary nodule determination for early diagnosis. A common approach is to integrate AI into the existing diagnostic process using image data as input, which comes with problems such as the requirement for sufficient computing power, uniform formatting of image data, and encryption and circulation of data among medical facilities. Therefore, we explored a de novo methodology for integrating AI into the diagnostic process, namely, establishing an interpretable model for automatic differentiation and determination of the nature of pulmonary nodules. We enrolled 142 patients with pathologically confirmed thoracic masses and constructed a pretrained model (PTM) based on the descriptions in their chest CT reports. In 
comparison with a convolutional neural network (CNN) model, our model achieved improved precision and recall. For benign nodules, the precision was 0.67 and the recall was 0.46, with an F1 score of 0.55; for malignant nodules, the precision was 0.65 and the recall was 0.81, with an F1 score of 0.72. For samples that were extremely skewed toward malignant or benign, the model showed perfect classification. In addition, we extracted interpretable classification features, which can provide clinical practitioners with a basis for differential diagnosis. To date, this is the first study to use free text from chest imaging reports as the sole input for the differential diagnosis of pulmonary nodules. Although this was a single-center trial with a small number of enrolled patients, the AI-assisted differentiation model could be improved as more data sets are included in the future.

Lung cancer ranks first among all malignancies in terms of prevalence and mortality, and an estimated 1.8 million patients died from lung cancer in 2021. Multiple primary lung cancer (MPLC) refers to the occurrence of two or more lung cancer lesions at different anatomic loci within one patient. Defined by the time of occurrence, MPLC is further divided into synchronous MPLC and metachronous MPLC. Because the different lesions in most patients with MPLC are histopathologically related, careful differentiation between MPLC and intrapulmonary metastasis of lung cancer is important for tumor staging, treatment selection, and prognosis prediction. The diagnosis of MPLC mainly relies on the clinical diagnostic criteria proposed by Martini and Melamed in 1975 and the American College of Chest Physicians’ 2007 diagnostic guidelines. The most reliable and feasible method for differentiating MPLC from intrapulmonary metastasis is comprehensive evaluation of the pathological and histological features of the multiple lesions. At present, the pathogenesis of MPLC remains unclear, although studies have made 
progress on the etiology, gross pathology, microscopic pathology, and clinical prognosis of MPLC. However, owing to the small number of patients, limited examination methods, and relatively ineffective experimental tests, small-scale studies of MPLC at the molecular level have been carried out only gradually in recent years. Methods such as genome sequencing, DNA methylation detection, chromosomal rearrangement analysis, and proteomics can all be used to explore the pathogenesis. Most studies have included samples from multiple lesions of the same pathological subtype within the same patient, whereas studies of lesions of different pathological subtypes from a single patient are very limited. In our study, a total of 53 biopsies from 16 patients with MPLC of different pathological subtypes were included. We performed whole-exome sequencing and methylation microarray analysis to analyze driver genes, differential gene expression, mutational signatures, mutational phylogenetic trees, and methylation principal component analysis (PCA). Methylation level correlations and methylation-based clustering of patients were also investigated. Our results showed that in most patients with MPLC, few mutations were shared among different lesions, and each lesion more likely arose independently through somatic mutation, which is consistent with the overexpression of proteins related to somatic hypermutation. A small number of patients had more gene mutations shared among different lesions of either identical or distinct pathological types. This implies that these patients may have had primary lung cancer with intrapulmonary metastasis, with or without pathological transformation, but were misdiagnosed with MPLC. It further suggests that molecular biology approaches could improve the effectiveness of the current diagnostic criteria for MPLC. In addition, the molecular-level pathogenesis of multiple primary squamous cell carcinomas and multiple primary adenocarcinomas may be completely different. The 
former is driven more by epigenetic alterations, whereas the latter is driven directly by abnormalities at the gene level. Subsequent inclusion of more patient samples will help to further explore the etiology, diagnosis, and treatment of MPLC.
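
The comparison of mutations shared among lesions, described above, can be illustrated with a minimal sketch. It assumes each lesion is represented as a set of somatic mutation identifiers and uses the Jaccard index as the overlap measure. The function names `shared_fraction` and `pairwise_overlap`, the choice of the Jaccard index, and the toy identifiers are all hypothetical and are not taken from the study.

```python
from itertools import combinations


def shared_fraction(a, b):
    """Jaccard index of two mutation sets (0 = no overlap, 1 = identical)."""
    union = a | b
    if not union:
        # Two empty sets share nothing; avoid division by zero.
        return 0.0
    return len(a & b) / len(union)


def pairwise_overlap(lesions):
    """Return {(lesion_i, lesion_j): Jaccard index} for every pair of lesions."""
    return {
        (i, j): shared_fraction(lesions[i], lesions[j])
        for i, j in combinations(sorted(lesions), 2)
    }


# Toy data (hypothetical): three lesions from one patient.
lesions = {
    "lesion_A": {"mut1", "mut2", "mut3"},
    "lesion_B": {"mut4", "mut5"},
    "lesion_C": {"mut1", "mut2", "mut6"},
}

overlap = pairwise_overlap(lesions)
for pair, value in overlap.items():
    print(pair, round(value, 2))
```

Under this assumed measure, a pair of lesions with an overlap near zero would fit independent origins, whereas a high overlap would point toward a shared clonal origin such as intrapulmonary metastasis. Any cutoff between the two would need to be set from real data.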