Learning From Crowds

Posted on:2015-07-04

Degree:Master

Type:Thesis

Country:China

Candidate:Z Q Liu

GTID:2298330452964011

Subject:Computer Science and Technology

Abstract/Summary:

Crowdsourcing solves a problem in a distributed manner: tasks are distributed to a large group of people in an online community. In general, supervised learning algorithms need a large number of labels to achieve satisfactory performance. Crowdsourcing services have recently provided an effective way to collect labeled data at much lower cost and in less time. However, the collected labels suffer from quality problems, so repeated labeling is usually adopted to collect multiple labels for each instance. We focus on three problems in this thesis. The first problem is how to train an accurate classifier from noisy labels. We propose a robust personal classifier model that estimates an expertise score for each labeler and learns a classifier at the same time. The second problem is how to estimate missing labels. In the real world, annotators typically do not label every data instance, and instances are not labeled by every annotator. We propose an algorithm, similar to collaborative filtering algorithms, that estimates a missing label from the labels the same labeler gave to similar instances. The third problem is how to combine active learning with crowdsourced learning, where the key difficulty lies in choosing a proper instance and a proper annotator. Experiments on synthetic and real data demonstrate that our algorithms achieve better performance than the baseline algorithms.
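
The abstract does not give the details of the missing-label algorithm, so the following is only a minimal sketch of the general idea it describes: filling in a labeler's missing label from the labels that same labeler gave to similar instances. Everything specific here is an assumption made for illustration, not the thesis's method: the function name `estimate_missing_labels`, binary labels in {0, 1}, Euclidean distance as the similarity measure, the neighbourhood size `k`, and the inverse-distance weighting.

```python
import numpy as np

def estimate_missing_labels(X, L, k=2):
    """Fill missing entries (NaN) of the label matrix L.

    X: (n_instances, n_features) feature matrix.
    L: (n_instances, n_labelers) binary labels in {0, 1}, NaN where missing.
    For each missing entry, take a similarity-weighted vote over the labels
    that the SAME labeler gave to the k nearest instances they did label.
    (Illustrative assumption, not the thesis's exact algorithm.)
    """
    X = np.asarray(X, dtype=float)
    L = np.asarray(L, dtype=float)
    filled = L.copy()

    # Pairwise Euclidean distances between all instances.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=2))

    n_labelers = L.shape[1]
    for j in range(n_labelers):
        observed = np.flatnonzero(~np.isnan(L[:, j]))
        if observed.size == 0:
            continue  # this labeler labeled nothing; leave entries as NaN
        for i in np.flatnonzero(np.isnan(L[:, j])):
            # The k closest instances that labeler j actually labeled.
            nearest = observed[np.argsort(dist[i, observed])][:k]
            weights = 1.0 / (1.0 + dist[i, nearest])
            score = np.dot(weights, L[nearest, j]) / weights.sum()
            filled[i, j] = 1.0 if score >= 0.5 else 0.0
    return filled

# Toy usage: two clusters of instances, two labelers, four missing labels.
nan = float("nan")
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
L = [[0, nan], [nan, 0], [1, 1], [nan, nan]]
completed = estimate_missing_labels(X, L, k=2)
```

In this toy setup each missing label is taken from the labeler's own label on the nearby instance in the same cluster. A fuller treatment would also need to account for the labeler expertise scores mentioned above, which this sketch ignores.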

Keywords/Search Tags:

Machine Learning, Crowdsourcing, Supervised Learning

Related items

1	Research On PDF Structure Parsing Based On Machine Learning And Crowdsourcing
2	Multi-Label Crowdsourcing Learning
3	Research On Extraction Technology Of Relation Between Enterprise Entities Based On Machine Learning
4	Utterance Labelling Crowdsourcing Platform Design And Implementation
5	Research On Machine Learning Methods That Exploit Unlabeled Data
6	Label Aggregation In Crowdsourcing
7	Knowledge Fusion Based On Machine Learning Model And Crowdsourcing
8	Active Learning and Crowdsourcing for Machine Translation in Low Resource Scenarios
9	Human-Machine Synergistic Learning
10	Research On The Algorithm Improving The Quality Of Crowdsourcing Data Labeling