| Lung cancer has the highest incidence among malignant tumors. Early diagnosis and effective treatment are crucial for reducing mortality and improving survival. Traditional lung cancer detection methods, such as low-dose computed tomography and positron emission tomography, are usually expensive and sensitive, and they expose the human body to radiation. Breath analysis based on an electronic nose is rapid, non-invasive, simple to operate, and low in cost, and is expected to be useful for early screening of lung cancer. However, current electronic nose systems mostly neglect optimization of the gas chamber and the sensor array, and few studies have addressed the staging of lung cancer patients. In addition, the pattern recognition framework strongly influences the recognition performance of an electronic nose system. To address these problems, this study designed a piston chamber and investigated sensor array optimization and pattern recognition algorithms in order to improve the performance of the electronic nose system in identifying lung cancer. In this study, 17 gas sensors and two temperature and humidity sensors were selected according to the lung cancer markers in exhaled breath and their concentrations, forming a differentiated sensor array. A new type of piston chamber was designed that can expel the residual background gas from the chamber, thus avoiding interference from ambient background gas with the detection results. Breath samples from 198 lung cancer patients and 232 healthy volunteers were collected for analysis. Linear discriminant analysis was used to select the 11 gas sensors with the highest weights from the original sensor array to form the optimal sensor array, which was then used to identify lung cancer. For lung cancer classification and recognition, principal component analysis and kernel principal component analysis were used to extract features. Logistic regression, support vector machine, decision tree, random forest and
k-nearest neighbors were used to classify and recognize the breath samples. The single classifiers were found to suffer from low accuracy and a high false negative rate. Therefore, the PCA-SVE ensemble learning framework, composed of the five base classifiers, and the AdaBoost ensemble classifier were applied to the electronic nose system to identify lung cancer. Experiments showed that the ensemble classifiers compensated for the shortcomings of the single classifiers and improved lung cancer recognition performance. The PCA-SVE framework performed best, with accuracy, sensitivity, specificity, and F1 score of 97.91%, 97.85%, 97.98%, and 98.05%, respectively. Validation data sets were used to verify the stability and generalization ability of each classifier. The PCA-SVE framework was then applied to identifying lung cancer patients at different clinical stages (65 stage Ⅱ, 70 stage Ⅲ, and 63 stage Ⅳ). The results showed that the electronic nose system could distinguish lung cancer patients at different stages with an accuracy of 84.61%. In summary, linear discriminant analysis was used to select the 11 gas sensors with the highest weights to form the optimal sensor array, and the PCA-SVE ensemble learning framework and the AdaBoost ensemble classifier were applied to classify and recognize the 430 collected breath samples. The results showed that the array optimization algorithm and the PCA-SVE ensemble classifier can improve the performance of the electronic nose system in identifying lung cancer. |
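
The abstract reports accuracy, sensitivity, specificity, and F1 for an ensemble of five base classifiers but does not define how the votes are combined or what "SVE" stands for. The following is a minimal sketch, assuming a soft-voting scheme (averaging the positive-class probabilities of the base classifiers) and a binary cancer/healthy task. The function names `soft_vote` and `binary_metrics`, the 0.5 threshold, and all probability values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
from statistics import fmean


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities from several base
    classifiers and return 1 (positive) if the mean reaches the threshold.
    Soft voting is an assumption here, not a detail stated in the abstract."""
    return 1 if fmean(probabilities) >= threshold else 0


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and F1 from binary labels
    (1 = lung cancer, 0 = healthy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on patients
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on healthy
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up probabilities: one row per breath sample, one column per
# hypothetical base classifier (five in total, as in the abstract).
sample_probs = [
    [0.9, 0.8, 0.7, 0.6, 0.9],
    [0.6, 0.4, 0.7, 0.5, 0.8],
    [0.3, 0.2, 0.6, 0.4, 0.1],
    [0.1, 0.2, 0.3, 0.1, 0.2],
    [0.4, 0.6, 0.3, 0.2, 0.1],
    [0.7, 0.6, 0.8, 0.4, 0.5],
]
y_true = [1, 1, 1, 0, 0, 0]

y_pred = [soft_vote(row) for row in sample_probs]
metrics = binary_metrics(y_true, y_pred)
print(y_pred)
print(metrics)
```

A false negative (a missed patient) lowers sensitivity, while a false positive lowers specificity, which is why the abstract reports the two separately alongside accuracy. The staging task described in the abstract has three classes and would need a multi-class extension of this sketch.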