Research And Application Of Machine Learning Methods For Insufficient Training Samples In Medical Scenes

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Liu
GTID: 2404330629952638
Subject: Communication and Information System
Abstract:
In recent years, with the rise of machine learning and especially the rapid development of deep learning, human society has entered a golden age of artificial intelligence. Large volumes of health care data are now being generated, and traditional medical care is gradually shifting toward intelligent modes of operation. Extracting valuable information from massive electronic medical records is both a requirement for the development of intelligent medicine and a major challenge in building intelligent medical systems. Experimental results have shown that a large number of labeled samples is necessary to obtain a model with high accuracy and good generalization performance. In the medical field, however, there are often not enough case data for diseases with a low incidence rate. How to generalize to rare categories from a small number of case samples is therefore a difficult problem in intelligent medicine, an active research topic in machine learning, and a question of strong practical significance.

Two typical cases of insufficient training samples arise in real medical scenarios. The first is highly imbalanced case data, such as prenatal screening data. Such data are generally structured, and they are difficult to learn from because of the low incidence rate and the unknown correlations between features. The second is small-sample medical images. Some diseases have many subtypes, and the number of samples for a rare subtype is small. When a data-hungry deep learning framework is trained on such samples, the resulting model usually has very low recognition accuracy on the test set because of overfitting. The research in this thesis focuses on these two situations.

For highly imbalanced data, this thesis proposes a cascaded framework named CVIFLR. Because a single supervised or unsupervised learning method cannot achieve both a low false positive rate and a high detection rate, CVIFLR combines supervised and unsupervised learning methods in one framework. This approach overcomes the disadvantages of traditional imbalanced learning methods based on resampling, and it improves classification performance by exploiting both the generalization of unsupervised learning and the accuracy of supervised learning. We compare CVIFLR with state-of-the-art imbalanced learning methods on a prenatal screening data set from Jilin Province. The experimental results show that CVIFLR outperforms the other methods on highly imbalanced medical data, and that a local model trained with CVIFLR can improve the quality of prenatal screening in Jilin Province.

For few-shot learning, this thesis first compares and analyzes metric-based methods. Then, to address the shallow feature extraction network and the insufficient feature extraction ability of existing methods, we use the Dense Layer structure to design a deeper Prototypical Network. The new structure improves the propagation of features and gradients and mitigates overfitting. Furthermore, we train the deeper Prototypical Network with an adversarial learning method to improve its generalization ability and accuracy.

The experimental results show that the proposed CVIFLR algorithm is effective for classifying highly imbalanced case data. On the data set for prenatal screening of Down's syndrome in Jilin Province, with a negative-to-positive ratio of 10244:108, CVIFLR outperforms state-of-the-art imbalanced learning methods. After parameter tuning, CVIFLR reaches an AUROC of 0.99. We also show that the improved Prototypical Networks proposed in this thesis (Dense P-Net and GAN-DPN) can extract generalized category features from a small number of sample images, and that their few-shot recognition accuracy on the miniImageNet data set is higher than that of the original Prototypical Network. The proposed algorithms provide effective methods for the classification of highly imbalanced data and the recognition of small-sample images. They have important application value in medicine, the military, industry, and other fields.
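
To make the metric-based idea behind a Prototypical Network more concrete, the sketch below shows only the generic classification step: each class prototype is the mean of that class's support embeddings, and a query is assigned to the nearest prototype under squared Euclidean distance. It is not the Dense P-Net or GAN-DPN described in the abstract. The embeddings are hand-written toy vectors standing in for the output of a feature extraction network, and the function names (`class_prototypes`, `classify`) are hypothetical.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the nearest-prototype label and a softmax over negative distances."""
    logits = {
        label: -squared_distance(query, proto)
        for label, proto in prototypes.items()
    }
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs


# Toy 2-way 2-shot episode (assumed data, for illustration only).
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["a", "a", "b", "b"]
protos = class_prototypes(support, labels)
pred, probs = classify([0.9, 0.9], protos)
print(pred, probs)
```

In a trained network the vectors would come from the learned embedding function, and the softmax over negative distances would feed the training loss; here it only illustrates how distances turn into class probabilities.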
Keywords: Intelligent Medicine, Imbalanced Data, Cascade Learning, Few-shot Learning, Adversarial Learning