| With the continuous development of information technology and the growing popularity of digital medical devices, large volumes of health big data are being generated. To make effective use of these data and mine useful information from them, machine learning and deep learning algorithms are widely applied. Ensemble learning is one such widely used family of algorithms: it first trains a group of base models, then combines them and takes the combined output of the multiple models as the final result. Ensemble algorithms are widely used in health big data diagnosis and treatment problems, where they generally yield more robust results. This paper studies how different ensemble algorithms address health big data diagnosis and treatment problems from several perspectives. It examines four key issues: (1) Ensemble pruning of deep learning base models: the aim is to remove redundant deep learning base models, which on the one hand reduces prediction time in practical applications and on the other hand eliminates the negative effects of redundant base models and improves performance; (2) Integration of multiple epochs in a self-ensemble algorithm: the aim is to address the inconsistent distributions of the test set and the validation set, and to eliminate redundant epoch models and improve accuracy through ensemble and ensemble pruning strategies; (3) Ensembles of multiple deep learning models: in medical diagnosis, relying on a single model's prediction increases the probability of missed diagnosis, whereas integrating heterogeneous deep learning models and combining their characteristics can reduce it; (4) Multi-source data model integration: multi-source data offer different perspectives. Integrating
multi-source data can improve prediction performance. Different data sources have different characteristics. By assigning different weights to different models and removing redundant features, the results from multi-source data can be integrated to achieve better classification accuracy. To address these problems, this paper makes the following contributions: (1) An acne staging algorithm based on ensemble learning and ensemble pruning is proposed to remove redundant base models, reduce the consumption of computing resources, and improve accuracy. The method performs ensemble pruning in two stages. In the first stage, the models are ranked by the diversity between models and the accuracy of each single model, and the top n models are selected as the first-stage subset; the higher a model's accuracy and the greater its diversity with respect to the other models, the more important it is. In the second stage, the models' prediction results are used to construct features, with each feature representing one model, and a feature selection algorithm selects the best feature subset as the model subset retained by ensemble pruning. Finally, a classifier integrates the results of the selected models. Compared with the traditional voting and weighted-average methods, this method achieves better prediction performance and reaches state-of-the-art performance on the acne staging problem. (2) The preceding approach trains multiple base models for integration, which consumes substantial computing resources during training. To solve this problem, an acne type classification method based on self-ensemble is proposed; it integrates the models saved at multiple epochs to address the unreliability of using the validation set to select an epoch. First, a convolutional neural network is trained on acne image data and the results of multiple epochs are saved. Compared with the
traditional method of selecting the single best epoch, this study integrates all epochs, which greatly improves generalization. Second, because some epochs are not fully trained, they may degrade the final integrated result, so feature selection algorithms are used to remove redundant models and improve performance. Finally, instead of voting or weighted averaging, a learnable classifier takes the model outputs as input and produces the integrated prediction. The proposed method outperforms some dermatologists with limited clinical experience, as well as several mainstream deep learning algorithms and classic machine learning algorithms based on manual feature extraction. (3) Integrating the results of homogeneous models, as above, cannot combine the advantages of heterogeneous models. This study therefore integrates multiple deep learning models to detect human nevi. Many melanomas develop from moles, and the two are difficult to distinguish, so patients may not seek hospital diagnosis in time. Photographing the lesion with a mobile phone and sending the picture to a doctor for remote diagnosis is a practical option; however, because moles are very small, doctors sometimes miss or misdiagnose them. To reduce the rate of missed diagnosis and the workload of doctors, this study integrates multiple object detection algorithms and a segmentation algorithm. First, multiple object detection algorithms detect the skin nevi, and non-maximum suppression (NMS) merges their detection results, combining the strengths of the different detectors to reduce the rate of missed diagnosis. Second, an image segmentation method delineates the nevus edge, and visualization software is built to assist doctors
in diagnosis, and a diagnosis report can be generated, which greatly reduces doctors' workload and the probability of missed diagnosis. (4) The predictions above are based mainly on a single data source, whereas integrating multi-source data may lead to better prediction performance. This study integrates transcriptome features and image features to predict melanoma metastasis, which seriously endangers patients' lives. In contrast to methods that predict melanoma metastasis from single-source data, this paper integrates protein-coding RNA, long non-coding RNA, and pathological image data. Features from different sources provide different perspectives, and integrating them yields predictions that generalize better. Because transcriptome features are high-dimensional and redundant, a feature selection algorithm is used to remove redundant features and extract more informative feature subsets. Because pathological images are large and memory-intensive, software is used to extract the proportion of each cell type among all cells in the image as features, and a neural network serves as the classifier. Finally, the outputs of the models for the different data sources are combined by linear weighting to improve prediction accuracy. This paper aims to develop ensemble-based solutions for the diagnosis and treatment of typical diseases using health big data. The resulting ensemble algorithms show stable performance with high accuracy. |
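The two-stage ensemble pruning outlined in contribution (1) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: diversity is measured as the pairwise disagreement rate, the ranking score mixes accuracy and diversity with a hypothetical weight `alpha`, the second stage uses backward elimination as a stand-in for the unspecified feature selection algorithm, and a majority vote replaces the trained classifier that the abstract describes. All function names and the toy data are hypothetical.

```python
from collections import Counter


def accuracy(pred, y):
    """Fraction of samples where the prediction matches the label."""
    return sum(p == t for p, t in zip(pred, y)) / len(y)


def disagreement(a, b):
    """Assumed diversity measure: share of samples where two models differ."""
    return sum(p != q for p, q in zip(a, b)) / len(a)


def majority_vote(preds):
    """Combine per-model label lists sample by sample (ties: smallest label)."""
    voted = []
    for votes in zip(*preds):
        counts = Counter(votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


def stage1_rank(preds, y, n, alpha=0.5):
    """Stage 1: keep the n models with the best accuracy/diversity score."""
    m = len(preds)
    scored = []
    for i in range(m):
        acc = accuracy(preds[i], y)
        div = sum(disagreement(preds[i], preds[j])
                  for j in range(m) if j != i) / max(m - 1, 1)
        scored.append((alpha * acc + (1 - alpha) * div, i))
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, then index
    return [i for _, i in scored[:n]]


def stage2_prune(preds, y, candidates):
    """Stage 2: drop models while doing so does not lower ensemble accuracy."""
    kept = list(candidates)
    best_acc = accuracy(majority_vote([preds[k] for k in kept]), y)
    while len(kept) > 1:
        trial_acc, trial_drop = -1.0, None
        # Try the lowest-ranked models first so ties remove them preferentially.
        for i in reversed(kept):
            rest = [k for k in kept if k != i]
            acc = accuracy(majority_vote([preds[k] for k in rest]), y)
            if acc > trial_acc:
                trial_acc, trial_drop = acc, i
        if trial_acc < best_acc:
            break
        kept.remove(trial_drop)
        best_acc = trial_acc
    return kept, best_acc


# Toy validation labels and predictions of five hypothetical base models.
y_val = [0, 1, 1, 0, 1, 0, 1, 1]
model_preds = [
    [0, 1, 1, 0, 1, 0, 0, 0],  # model 0
    [0, 1, 1, 0, 1, 0, 0, 0],  # model 1: exact duplicate of model 0
    [0, 1, 0, 1, 1, 0, 1, 1],  # model 2
    [1, 0, 1, 0, 1, 0, 1, 1],  # model 3
    [1, 0, 0, 1, 0, 1, 1, 0],  # model 4: weak model
]

first_stage = stage1_rank(model_preds, y_val, n=4)
pruned, pruned_acc = stage2_prune(model_preds, y_val, first_stage)
print(first_stage)         # [2, 3, 0, 1]
print(pruned, pruned_acc)  # [2, 3, 0] 1.0
```

On this toy data the first stage discards the weak model and the second stage removes the duplicate, which lifts the voted accuracy from 0.75 to 1.0. In a real setting the pruning would be scored on held-out data, and the vote would be replaced by a classifier trained on the retained models' predictions.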