| In recent years, artificial intelligence technologies have advanced rapidly and been integrated into daily life. Traditional supervised machine learning requires sufficiently annotated training data, which limits the applicability of many machine learning methods. In practice, insufficiently annotated data are abundant and diverse, so designing machine learning methods that can exploit such data is of great research value. This study focuses on various forms of insufficiently annotated data, their characteristics, and the limitations of existing methods. Specifically, the following work is carried out. For scenarios where labels are given to bags of samples instead of individual samples, we model them as multiple instance learning problems. In image segmentation, we propose an end-to-end neural network model, Attention U-net, to address the low segmentation resolution and poor performance of supervised semantic segmentation models. The Attention U-net consists of two modules: an upsampling-concatenating-convolution structure and an attention pooling layer. We design experiments on chest X-rays and conventional electrocardiograms to validate the model’s performance and utility on images and temporal signals. For scenarios where only negative sample labels are given, we model them as anomaly detection problems. For the detection of fine-grained image defects, we propose the DeSTSeg model, which consists of a denoising autoencoder student network, a teacher network, and an anomaly segmentation network for adaptive feature fusion. The model accurately segments abnormal regions while detecting abnormal samples, which addresses the unstable detection results and ambiguous anomaly localization of current knowledge distillation-based anomaly detection methods. Our method improves anomaly detection performance at the image, pixel, and instance levels on industrial inspection data. For scenarios where human-machine interactive labeling is involved, we model them as active learning problems, aiming to reduce manual labeling while reaching a given model accuracy. Active selection strategies have been shown to be less efficient than simple random selection at certain times. We propose the adaptive active-selection-random-selection (ASRS) algorithm, which adaptively chooses the active or the random selection strategy according to the model’s state at a given time, combining the advantages of both strategies to achieve better performance in interactive labeling with few samples. ASRS significantly reduces manual labeling in the semi-automatic diagnosis of ambulatory electrocardiograms. In summary, this study proposes machine learning methods for various scenarios of insufficiently annotated data, addresses the limitations of existing methods, and thoroughly validates the effectiveness of the proposed methods in related application scenarios, thereby facilitating the subsequent development of related research. |
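
The abstract does not specify the attention pooling layer beyond naming it. As a minimal sketch, assuming the commonly used attention-based multiple instance pooling (each instance is scored by a small tanh layer, and the scores are normalized with a softmax over the bag), the idea can be illustrated in plain Python. The names `attention_pool`, `V`, and `w` and all numeric values are hypothetical and are not taken from the Attention U-net model.

```python
import math


def attention_pool(instances, V, w):
    """Pool a bag of instance feature vectors into one bag-level vector.

    instances: list of feature vectors, one per instance in the bag
    V: hidden-by-feature weight matrix (hypothetical learned parameter)
    w: hidden-size weight vector (hypothetical learned parameter)
    Returns (bag_vector, attention_weights).
    """
    # Score each instance h as w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Softmax over the instances of the bag (shifted for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The bag representation is the attention-weighted sum of instance features
    dim = len(instances[0])
    bag_vector = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return bag_vector, weights


# Toy bag of three instances with two features each (made-up values)
bag = [[0.1, 0.2], [2.0, 1.5], [0.0, -0.1]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
bag_vector, attention = attention_pool(bag, V, w)
```

Because the attention weights sum to one over the bag, they can be read as a per-instance contribution to the bag-level label, which is what makes this kind of pooling useful when only bag labels are available.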