| The performance of machine learning models depends on the number of labeled samples in the training set. However, labeling samples is expensive in many application fields. The core idea of active learning is to select, according to a sample selection strategy, only the samples in the unlabeled set that are most worth annotating, rather than all samples, so as to reduce annotation cost. Current active learning techniques have the following limitations: (1) The sample selection strategy is often heuristic, designed from experience or intuition. (2) Experts must manually design specific sample selection strategies for specific tasks and models, and the resulting strategies do not generalize. (3) Conventional active learning treats the iterative sample selection process as a series of independent single-step decisions and does not consider the correlation between decisions. To address these shortcomings, we propose an active learning method based on reinforcement learning. We further use the idea of domain adaptation to remove the need for a large number of labeled samples to train the reinforcement learning network, and propose a domain-adaptive automated active learning framework. The method combines good active learning performance with generality across a variety of tasks and models. The research content of this paper consists of the following two parts: (1) To address the shortcomings of conventional active learning, we propose a novel active learning method based on reinforcement learning, named Reinforcement Active Learning (RAL). This method models the sample selection process in active learning as a reinforcement learning decision process and uses Deep Q-Learning to train an agent that decides whether or not to select each sample in a specific task. This method connects the reinforcement learning system with the 
recognition model. After the agent decides to select a sample, the resulting change in the recognition model's accuracy is used as feedback on that action to optimize the agent's policy. This replaces heuristic sample selection strategies with a learned one and automates the process of experts manually designing sample selection strategies. The experimental results show that, compared with several commonly used sample selection strategies, the proposed reinforcement learning based active learning method achieves the best performance on the three datasets and their corresponding recognition models, and saves the most labeling cost. (2) The RAL method proposed in this paper requires labeled samples to train the reinforcement learning network. To overcome this shortcoming, we propose an Automated Active Learning framework (Auto AL) using the idea of deep domain confusion from domain adaptation. In this method, we treat a labeled public dataset as the source domain data and the unlabeled sample set to be selected from as the target domain data, and add an adaptation layer to the reinforcement learning network to compute the maximum mean discrepancy (MMD) between the two domains. The method minimizes the domain loss while optimizing the reinforcement learning loss and deeply fuses the two domains, so that the reinforcement learning agent trained on source domain data can also be used for sample selection on target domain data. The experimental results on three sets of transfer learning datasets verify the effectiveness of using domain adaptation in reinforcement learning: the proposed automated active learning framework can train reinforcement learning networks without labeled target domain samples, further reducing the labeling cost of target domain data. |
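
The joint objective described above (a reinforcement learning loss plus an MMD-based domain loss computed on adaptation-layer features) can be illustrated with a minimal sketch. The abstract does not state the kernel or the loss weighting, so this sketch assumes the linear-kernel form of MMD, i.e. the squared distance between the mean feature vectors of the two domains. The function names and the weight `lam` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_embedding(features: Sequence[Vector]) -> List[float]:
    """Average the feature vectors of one domain, dimension by dimension."""
    n = len(features)
    dim = len(features[0])
    return [sum(row[j] for row in features) / n for j in range(dim)]


def linear_mmd2(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Squared MMD with a linear kernel (assumed form, not confirmed by the paper):
    the squared Euclidean distance between the two domain mean embeddings."""
    mu_s = mean_embedding(source)
    mu_t = mean_embedding(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def combined_loss(rl_loss: float,
                  source: Sequence[Vector],
                  target: Sequence[Vector],
                  lam: float = 0.25) -> float:
    """Hypothetical joint objective: RL loss plus a weighted domain loss.
    `lam` is an illustrative trade-off weight, not a value from the paper."""
    return rl_loss + lam * linear_mmd2(source, target)


if __name__ == "__main__":
    # Toy adaptation-layer features for a source batch and a target batch.
    src = [[0.0, 0.0], [2.0, 2.0]]
    tgt = [[1.0, 1.0], [1.0, 1.0]]
    print(linear_mmd2(src, tgt))         # identical means, so the domain loss is 0.0
    print(combined_loss(1.0, src, tgt))  # the RL loss passes through unchanged
```

In a real network the features would be the adaptation layer's activations for source and target batches, and the domain term would be minimized by gradient descent together with the Q-learning loss. The pure-Python version here only shows how the two terms combine.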