Thyroid cancer is one of the most common diseases of the human endocrine system. Ultrasound examination can identify suspicious thyroid nodules that are difficult to detect by palpation, and it is an effective means of detecting thyroid cancer early. Hospital ultrasound department information systems store a large volume of visit data from thyroid patients, and these data contain a wealth of medical information. Mining and analyzing the ultrasound department's thyroid data with scientific methods, and extracting information that helps doctors improve diagnostic accuracy, is therefore of practical significance. This study focused on decision support for the auxiliary diagnosis of thyroid nodules. Based on thyroid ultrasound data from the information system of a cooperating hospital, it aimed to help doctors improve the accuracy and efficiency of distinguishing benign from malignant thyroid nodules. The research consisted of the following parts. First, the raw ultrasound thyroid data were preprocessed, drawing on the professional knowledge of doctors, to obtain standardized, valid data. The valid thyroid data were then analyzed comprehensively along four dimensions: overall description, single indicators versus pathological results, multiple indicators versus pathological results, and relationships among the indicators, in order to gain a preliminary understanding of the characteristics of the data. Second, according to these characteristics, principal component analysis (PCA) was used to eliminate the correlation between thyroid indicators and reduce the dimension of the input, and individual classifiers were compared with ensemble learning algorithms. Combining binary and multi-class classification, an auxiliary diagnosis model for thyroid nodules based on PCA-ensemble learning was proposed. Finally, using the valid thyroid data, six algorithms were evaluated comprehensively in terms of classification performance and running efficiency: logistic regression, support vector machine (SVM), k-nearest neighbors, decision tree, random forest and extreme gradient boosting (XGBoost). In addition, the models before and after PCA were compared to verify the effectiveness of the PCA-ensemble learning model in improving the accuracy and efficiency of doctors' diagnoses. The experimental results show that the average AUC of the PCA-ensemble learning model reaches a maximum of 0.96, and that its efficiency is improved by up to 47.7%.
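
The two evaluation measures reported above, AUC and the gain in efficiency, can be illustrated with a minimal sketch. It is not the study's evaluation code. The function names `roc_auc` and `relative_speedup`, the sample labels and scores, and the timings are all hypothetical. The abstract does not state how efficiency was measured, so the sketch assumes it is the relative reduction in running time after PCA.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    labels: 1 for malignant, 0 for benign (hypothetical encoding).
    scores: predicted malignancy scores; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_speedup(baseline_seconds: float, reduced_seconds: float) -> float:
    """Assumed efficiency measure: share of running time saved after PCA."""
    return (baseline_seconds - reduced_seconds) / baseline_seconds


# Illustrative values only; none of these numbers come from the study.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)              # 8 of 9 pairs are ranked correctly
speedup = relative_speedup(10.0, 5.0)      # half of the running time is saved
```

The pairwise form is quadratic in the number of samples. It is used here because it makes the definition of AUC explicit; a rank-based computation gives the same value more cheaply on large data sets.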