Multi-Label Active Learning With Hierarchical Label Structure

Posted on: 2020-01-30  Degree: Master  Type: Thesis
Country: China  Candidate: Y F Yan  Full Text: PDF
GTID: 2428330590472666  Subject: Computer Science and Technology
Abstract/Summary:
The performance of traditional supervised models relies on a large amount of labeled data. However, in many real-world tasks, labeling data is difficult and time-consuming, so it is very important to train an effective model with less labeled data. Active learning is among the most important approaches to this problem. Specifically, it iteratively selects the most valuable instances from the unlabeled data, queries an oracle for their supervised information, and adds them to the training set to improve model performance. In multi-label learning tasks, each instance may have multiple labels simultaneously, which increases the annotation cost, so the benefits of active learning are more significant. In some multi-label learning tasks, the label set can be explicitly organized into a tree structure from coarse to fine; labels at deeper layers carry more detailed information and are also expected to cost more to annotate. Existing active learning algorithms mostly consider the label information itself and ignore the correlation between labels in the hierarchical structure. Moreover, these approaches try to reduce the labeling cost by reducing the number of queried labels, which may lead to few queries but a high annotation cost. In this thesis, we exploit label hierarchies for multi-label active learning and account for the different annotation costs within the label hierarchy. Our main contributions are:

1. We propose HALC, a novel approach to multi-label active learning based on the label hierarchy. First, we propose a new criterion for estimating informativeness that considers the potential contributions of ancestor and descendant labels in the label hierarchy. Second, two active strategies, based on a bi-objective optimization problem and a knapsack problem, are used to balance the conflict between annotation cost and informativeness. Experiments demonstrate that HALC effectively reduces the annotation cost and significantly improves model performance.

2. We propose ALCAL, a novel approach to multi-label active learning that adaptively learns the label correlation. We measure the informativeness of an instance-label pair by learning the label correlation instead of formulating the information criterion heuristically. We balance the conflict between informativeness and annotation cost by integrating the informativeness measure and the active strategy into a unified framework. Experiments demonstrate that ALCAL effectively learns the correlation between labels in the hierarchical structure and reduces the annotation cost.
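The abstract does not give the formulation of the knapsack-based strategy, so the following is only a minimal sketch of the general idea, not the HALC algorithm itself. It assumes that each candidate instance-label pair already has an informativeness score, that its annotation cost is a positive integer (here, hypothetically, the depth of the label in the hierarchy), and that each query round has a fixed cost budget. The function name `select_queries` and all values are illustrative assumptions.

```python
from typing import List, Tuple


def select_queries(values: List[float], costs: List[int],
                   budget: int) -> Tuple[float, List[int]]:
    """0/1 knapsack: pick candidates maximizing total informativeness
    while keeping the total annotation cost within the budget."""
    n = len(values)
    # best[i][b] = highest total value using the first i candidates
    # with total cost at most b.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, c = values[i - 1], costs[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # skip candidate i-1
            if c <= b and best[i - 1][b - c] + v > best[i][b]:
                best[i][b] = best[i - 1][b - c] + v  # take candidate i-1

    # Backtrack to recover which candidates were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


# Hypothetical candidates: (informativeness, cost = assumed label depth).
scores = [0.875, 0.5, 0.375, 0.25]
depth_costs = [3, 2, 2, 1]
total, chosen = select_queries(scores, depth_costs, budget=4)
print(total, chosen)
```

In this toy setting, a deep but informative label competes with cheaper, shallower ones for the same budget, which is the trade-off between informativeness and annotation cost that the abstract describes.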
Keywords/Search Tags:Machine Learning, Active Learning, Multi-Label Learning, Hierarchical Label Structure, Hierarchical Classification, Cost Sensitive