
Biased Labeling In Crowdsourcing Systems

Posted on: 2016-07-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 1108330473961648
Subject: Computer application technology
Abstract/Summary:
With the emergence of crowdsourcing systems such as Amazon Mechanical Turk, many tasks that machine intelligence cannot yet handle can be posted to these systems and completed manually by online users in a micro-outsourcing manner. The machine learning and data mining communities benefit from crowdsourcing: many labeling tasks that used to be time-consuming and costly when performed by human experts have shifted to crowdsourced labeling, which accelerates data updating and model refinement. However, without any guarantee of the labeling quality of crowdsourced annotators, machine learning from crowdsourced labeled data faces great challenges. It is therefore valuable to study machine learning algorithms that exploit multiple noisy labels. This dissertation starts from the biased labeling problem, and its main research contributions are as follows.

(1) We discuss data quality and model quality issues in crowdsourcing systems and define the biased labeling problem. We theoretically analyze the impact of biased labeling on the majority voting strategy that is commonly used in many crowdsourcing systems. We then analyze a large number of real-world data sets to reveal the existence of biased labeling and its causes. Finally, we conduct a set of experiments on data sets with biased labeling and investigate the performance of EM-based ground truth inference algorithms. Our findings confirm that biased labeling indeed deteriorates the performance of EM-based ground truth inference algorithms.

(2) For binary labeling, we propose a novel algorithm, positive label frequency threshold (PLAT), which automatically estimates the decision threshold. The algorithm relies only on the positive and negative labels in the multiple noisy label set of each example, without requiring any prior knowledge such as the quality of labelers, the underlying class distribution, or the level of bias. It automatically estimates the threshold that divides the examples into positive and negative. Experimental results on both synthetic and real-world data sets show that PLAT not only infers integrated labels and builds a high-quality learning model under biased labeling, but also performs well compared with other state-of-the-art algorithms under unbiased labeling.
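The abstract does not spell out PLAT's threshold-estimation rule, so the sketch below only illustrates the core idea of integrating binary labels by thresholding each example's positive label frequency; the between-class-variance criterion and all function names are assumptions for illustration, not the actual PLAT procedure.

# A minimal sketch of integrating binary labels by thresholding the positive
# label frequency. The threshold-selection rule (maximizing between-class
# variance over candidate cuts) is an illustrative assumption, not PLAT's
# actual estimation procedure.
from collections import Counter
from typing import Dict, List

def positive_frequency(noisy_labels: List[int]) -> float:
    """Fraction of positive (1) labels in one example's multiple noisy label set."""
    return Counter(noisy_labels)[1] / len(noisy_labels)

def estimate_threshold(freqs: List[float], grid: int = 100) -> float:
    """Choose the cut on positive label frequency that best separates two groups."""
    best_t, best_score = 0.5, -1.0
    for i in range(1, grid):
        t = i / grid
        neg = [f for f in freqs if f < t]
        pos = [f for f in freqs if f >= t]
        if not neg or not pos:
            continue
        w0, w1 = len(neg) / len(freqs), len(pos) / len(freqs)
        m0, m1 = sum(neg) / len(neg), sum(pos) / len(pos)
        score = w0 * w1 * (m0 - m1) ** 2  # between-class variance of this split
        if score > best_score:
            best_t, best_score = t, score
    return best_t

def integrate_labels(label_sets: Dict[str, List[int]]) -> Dict[str, int]:
    """Infer one integrated label per example from its multiple noisy labels."""
    freqs = {ex: positive_frequency(ls) for ex, ls in label_sets.items()}
    t = estimate_threshold(list(freqs.values()))
    return {ex: int(f >= t) for ex, f in freqs.items()}

# Toy usage: three annotators per example, with a positive labeling bias.
print(integrate_labels({"e1": [1, 1, 0], "e2": [1, 0, 0], "e3": [1, 1, 1]}))

Unlike majority voting with a fixed 0.5 cut, a data-driven threshold can shift toward the biased side of the frequency distribution, which is the behavior the dissertation attributes to PLAT.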
(3) Labeling with crowdsourcing is a dynamic process that follows the perspective of active learning. We propose a novel active learning framework for biased labeling. The framework includes two core procedures: ground truth inference and instance selection. During ground truth inference, the PLAT algorithm is used to infer the integrated labels of examples from their multiple noisy label sets. During instance selection, we propose three uncertainty-based strategies to improve learning performance: one based on the multiple noisy labels and the level of bias (MLSI), one based on the currently learned model and the level of bias (CMPI), and a hybrid strategy (CFI). Experimental results on both synthetic and real-world data sets with different underlying class distributions show that the CFI strategy performs best.

(4) For multi-class labeling, we propose a novel algorithm, GTIC, which handles the biased labeling problem in a fuzzy manner and improves the accuracy of inferring integrated labels from multiple unreliable labelers. For a K-class labeling problem, GTIC generates concept-level features from the multiple noisy label set of each example; all examples are then clustered into K groups with the classic K-Means algorithm, each group is mapped to a class, and all examples in the group are assigned that class label. Experimental results on low-quality data sets show that GTIC is superior to state-of-the-art algorithms in both accuracy and M-AUC. Moreover, GTIC is about ten times faster than MLE- and EM-based ground truth inference algorithms, is easy to parallelize, and is suitable for big data applications.
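To make the clustering idea concrete, the sketch below uses per-class label fractions as a stand-in for GTIC's concept-level features, clusters the examples into K groups with K-Means, and maps each cluster to the class most voted by its members; the feature construction and the mapping rule are assumptions for illustration rather than GTIC's exact definitions.

# A minimal sketch of clustering-based label integration for K classes.
# The per-class label-fraction features and the cluster-to-class mapping
# (majority of noisy labels inside each cluster) are illustrative assumptions;
# they are not claimed to match GTIC's exact concept-level features or mapping.
import numpy as np
from sklearn.cluster import KMeans

def label_fraction_features(label_sets, num_classes):
    """One feature vector per example: the fraction of its noisy labels in each class."""
    feats = np.zeros((len(label_sets), num_classes))
    for i, labels in enumerate(label_sets):
        for y in labels:
            feats[i, y] += 1.0
        feats[i] /= len(labels)
    return feats

def cluster_and_integrate(label_sets, num_classes):
    """Cluster examples into K groups, then assign one class label per group."""
    feats = label_fraction_features(label_sets, num_classes)
    groups = KMeans(n_clusters=num_classes, n_init=10, random_state=0).fit_predict(feats)
    integrated = np.zeros(len(label_sets), dtype=int)
    for g in range(num_classes):
        members = np.where(groups == g)[0]
        votes = np.zeros(num_classes)
        for i in members:
            for y in label_sets[i]:
                votes[y] += 1
        integrated[members] = int(votes.argmax())  # assumed cluster-to-class mapping
    return integrated

# Toy usage: a 3-class task with three noisy labels per example.
sets = [[0, 0, 1], [1, 1, 1], [2, 1, 2], [0, 2, 0]]
print(cluster_and_integrate(sets, num_classes=3))

Because the clustering step avoids iterative EM updates over labeler parameters, a pipeline of this shape is cheap and straightforward to parallelize, which is consistent with the speed and scalability claims above.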
Keywords/Search Tags: Multiple Noisy Labels, Biased Labeling, Ground Truth Inference, Learning from Crowds, Active Learning, Supervised Learning