Self-paced Learning Based Quality Control Model For Crowdsourcing Classification Data

Posted on: 2018-01-15 | Degree: Master | Type: Thesis
Country: China | Candidate: H Shi | Full Text: PDF
GTID: 2359330536460873 | Subject: Software engineering
Abstract/Summary:
Crowdsourcing (also known as human computation or citizen wisdom) is the act of an institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. Crowdsourcing has become increasingly popular in recent years, driven by the belief that the wisdom of the crowd is superior to the judgments of individuals. Crowdsourcing platforms, such as Amazon Mechanical Turk and CrowdFlower, distribute tasks to workers who are paid for their answers. Quality control for crowdsourced classification plays an essential role in such systems. Majority voting (MV) is a direct, heuristic solution to this problem. However, MV fails to take into account the differing reliabilities of workers and the differing difficulties of objects. To overcome this limitation, many researchers have proposed models that emphasize the differences among workers and among objects. Yet existing algorithms, which assume that all data samples have equal priority, are prone to getting stuck in a bad local optimum on ill-defined real-world datasets. In this thesis, we propose a novel self-paced probabilistic model. The proposed model integrates a priority-based sample-picking strategy with the GLAD model to determine the easy samples, which are learned first. We also define the concept of easiness for crowdsourcing data samples and propose a method for obtaining proper prior distributions of the parameters based on knowledge from both the dataset and the model. We interpret our model as a probabilistic graphical model and show that it is an effective approximation of generative models. We also demonstrate empirically that the proposed self-paced learning strategy improves common quality control methods.
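To make the two baseline ideas in the abstract concrete, the sketch below shows majority voting and an easy-first ordering of samples. It is only an illustration under stated assumptions: the function names (`majority_vote`, `self_paced_order`, `select_easy`), the toy data, and the use of vote agreement as a proxy for easiness are all hypothetical. The thesis defines its own notion of easiness and builds on the GLAD model, neither of which is reproduced here.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label and the fraction of workers who chose it."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


def self_paced_order(answers):
    """Rank items from easiest to hardest.

    Assumption: worker agreement stands in for easiness. This is a
    placeholder, not the easiness measure defined in the thesis.
    """
    scored = []
    for item, labels in answers.items():
        label, agreement = majority_vote(labels)
        scored.append((item, label, agreement))
    # Highest agreement first, so "easy" items are handled before hard ones.
    return sorted(scored, key=lambda entry: entry[2], reverse=True)


def select_easy(ordered, threshold):
    """Keep only the items whose agreement reaches the current pace threshold."""
    return [item for item, _, agreement in ordered if agreement >= threshold]


# Toy data: each item maps to the binary labels given by five hypothetical workers.
answers = {
    "img1": [1, 1, 1, 1, 1],
    "img2": [1, 0, 1, 0, 1],
    "img3": [0, 0, 0, 1, 0],
}

ordered = self_paced_order(answers)

# Lowering the threshold round by round admits progressively harder items.
for threshold in (0.9, 0.7, 0.5):
    print(threshold, select_easy(ordered, threshold))
```

In a self-paced scheme, a model would be refitted on the admitted items after each round. Here the loop only prints which items pass each threshold, since the model-fitting step depends on details the abstract does not give.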
Keywords/Search Tags:Quality control, Crowdsourcing, Self-paced learning, Probabilistic graphical model