Deep learning models have shown excellent performance in an increasing number of scenarios thanks to their strong learning ability. However, training these models requires large amounts of annotated data, which incurs substantial labeling cost and poses a significant challenge to algorithmic efficiency. Active learning aims to train high-performance models with only a small fraction of the data annotated, by labeling only the most representative samples in the dataset. At present, most mainstream active learning algorithms extract features with convolutional neural networks and select unlabeled samples for annotation according to the resulting feature space. However, they do not account for the erroneous labels or class imbalance present in many active learning scenarios, so the final feature space is disturbed by many noisy dimensions, which degrades sampling quality. In addition, mainstream active learning algorithms rely on uncertainty-based sampling strategies, which often lead to severe sampling bias under the batch-mode sampling used when training deep models. This thesis proposes a novel active learning training method based on hidden space representation. Through a β-VAE model, the data are mapped and reduced to a distribution in the hidden space, which effectively alleviates the interference of noisy dimensions. On this basis, an attention mechanism is integrated into the model in a way that avoids the bypass phenomenon caused by introducing attention directly into the β-VAE, helping the model better extract the important features of the data. Finally, to alleviate sampling bias in the sampling phase, a mixed sampling strategy based on the Wasserstein distance is proposed; compared with uncertainty-based sampling, it achieves better performance. This thesis conducts extensive comparative experiments on several common datasets to explore the feasibility and scalability of the algorithm. Compared with current mainstream active learning methods, the proposed method achieves stable performance improvements across these datasets. Ablation experiments on the three proposed modules further verify the importance and effectiveness of each module.
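To make the sampling-phase idea concrete, the following is a minimal sketch of a mixed selection rule that trades off classifier uncertainty against how closely a candidate batch's hidden-space distribution matches the unlabeled pool, with the match measured by the Wasserstein distance. This is an illustration under assumptions, not the thesis's implementation: the β-VAE encoder is assumed to have already produced latent means for the pool (`latent_pool`), `uncertainty` stands in for per-sample classifier uncertainty, and the names `select_batch_mixed`, `alpha`, and `candidate_factor` are hypothetical.

```python
# Sketch of a Wasserstein-based mixed sampling step (illustrative names).
import numpy as np
from scipy.stats import wasserstein_distance


def distribution_gap(candidate: np.ndarray, pool: np.ndarray) -> float:
    """Average per-dimension 1-D Wasserstein distance between the hidden-space
    distribution of a candidate batch and that of the full unlabeled pool."""
    return float(np.mean([
        wasserstein_distance(candidate[:, d], pool[:, d])
        for d in range(pool.shape[1])
    ]))


def select_batch_mixed(latent_pool: np.ndarray,
                       uncertainty: np.ndarray,
                       budget: int,
                       alpha: float = 0.5,
                       candidate_factor: int = 5) -> np.ndarray:
    """Greedy mixed selection: shortlist the most uncertain samples, then
    repeatedly add the sample that keeps the selected batch's hidden-space
    distribution close to the pool's while still favoring high uncertainty."""
    # Candidate shortlist: the budget*candidate_factor most uncertain samples.
    shortlist = np.argsort(-uncertainty)[:budget * candidate_factor]
    selected = [int(shortlist[0])]
    remaining = set(int(i) for i in shortlist[1:])

    while len(selected) < budget and remaining:
        best_idx, best_score = None, -np.inf
        for idx in remaining:
            batch = latent_pool[selected + [idx]]
            gap = distribution_gap(batch, latent_pool)
            # Mixed score: reward uncertainty, penalize distributional mismatch.
            score = alpha * uncertainty[idx] - (1.0 - alpha) * gap
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.discard(best_idx)
    return np.asarray(selected)


# Example with random stand-ins for β-VAE latent means and uncertainties.
latent_pool = np.random.randn(1000, 16)
uncertainty = np.random.rand(1000)
batch_indices = select_batch_mixed(latent_pool, uncertainty, budget=32)
```

Greedy selection over an uncertainty shortlist keeps the per-step Wasserstein computation tractable; the thesis's actual strategy may differ in how the two terms are weighted and in which reference distribution the batch is compared against.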