| 【Background】 Subsolid nodules (SSNs) are lung nodules characterized by ground-glass opacities on chest computed tomography (CT) images, and they have garnered widespread attention in the classification of pulmonary nodules. With the rapid development of modern medical imaging technology and the promotion of CT-based lung cancer screening, the detection rate of SSNs, including sub-centimeter nodules, has increased significantly. For many years, SSNs were regarded as radiological manifestations of pneumonia, until it was shown in recent years that they can also be an early manifestation of lung cancer. Lung cancer is the most common cause of cancer death worldwide, and adenocarcinoma is its most common type; the need for early diagnosis has made it a hot topic in public health research. Early diagnosis of sub-centimeter subsolid invasive lung adenocarcinoma is crucial for improving patients' quality of life and survival, because the optimal treatment strategy varies with the degree of invasion. 【Objective】 In contrast to traditional diagnostic methods for sub-centimeter subsolid invasive lung adenocarcinoma, we aimed to construct CT-based prediction models using radiomics and machine learning techniques, and to evaluate and compare their diagnostic performance and clinical utility. 【Methods】 We conducted a retrospective study in which we collected data and samples from surgically confirmed adenocarcinoma spectrum lesions that presented as sub-centimeter subsolid pulmonary nodules. We developed three predictive radiomics models using different machine learning classifiers to discriminate the invasiveness of these nodules. 【Results】 Our study included 203 sub-centimeter nodules from 177 patients, which were randomly assigned to either the training set (n=143) or the test set (n=60). A total of 1781 radiomic features were extracted, and the 10 features with the highest predictive value were selected. Three
prediction models were constructed. In the training set, the areas under the curve (AUCs) were 0.743 (95% confidence interval [CI]: 0.661–0.824) for logistic regression, 0.828 (95% CI: 0.76–0.896) for the support vector machine, and 0.917 (95% CI: 0.869–0.965) for XGBoost. In the test set, the corresponding AUCs were 0.803 (95% CI: 0.694–0.913), 0.726 (95% CI: 0.598–0.854), and 0.85 (95% CI: 0.776–0.972). The predictive performance of the three models on the test set was not significantly different from that on the training set (P>0.05). Furthermore, decision curve analysis showed that the XGBoost model had a higher net benefit across the threshold probability range of 0.06 to 0.93 and outperformed the logistic regression and support vector machine models. Among the three radiomics-based machine learning prediction models, the XGBoost model therefore demonstrated the best performance. 【Conclusion】 Radiomics features combined with machine learning models performed well in predicting the invasiveness of sub-centimeter subsolid pulmonary nodules. This indicates that radiomics features can reflect the heterogeneity and invasiveness of these nodules. These models could provide a non-invasive and convenient method to assist clinicians in diagnosing pulmonary nodules. |
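
The net benefit reported by decision curve analysis can be illustrated with a minimal sketch. It uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names `net_benefit` and `treat_all_net_benefit` and the toy labels and probabilities are hypothetical, chosen only for illustration; they are not the study's code or data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of calling a nodule invasive when its predicted
    probability is at or above `threshold` (standard DCA definition)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: invasive nodules flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: non-invasive nodules flagged at this threshold.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat every nodule as invasive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (assumed, not from the study): 1 = invasive, 0 = non-invasive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = treat_all_net_benefit(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, which is zero at every threshold. Plotting these values over a range of thresholds yields the decision curve.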