Research On Domain And Plagiarism Perception Crowdsourcing Model Based On Task Difficulty Differentiation

Posted on: 2022-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: S X Si
GTID: 2518306779470104
Subject: Automation Technology
Abstract/Summary:
With the development of information technology, crowdsourcing is becoming more and more widely used. Crowdsourcing draws on the collective intelligence of human workers to solve problems that are difficult for conventional computing to solve accurately. It involves two interrelated processes: task assignment and truth discovery. Task assignment means selecting appropriate tasks and assigning them to appropriate workers. Truth discovery means handling the noise and errors in workers' answers and inferring the correct answer to each task from conflicting answers. As research on crowdsourcing grows, the main research problem has become how to improve the accuracy of crowdsourcing results effectively while reducing costs.

However, most existing algorithms ignore the fact that "people have their own strengths" and do not consider how a worker's accuracy differs across domains. In addition, most algorithms assume by default that workers complete tasks independently, ignoring possible copying relationships between workers. Finally, existing crowdsourcing platforms ignore differences in task difficulty and either fix the number of workers required for each task or set no upper limit, which wastes cost. To solve these problems, this thesis proposes two crowdsourcing models that both improve the accuracy of crowdsourcing results and reduce time and labor costs. The main research contents are as follows.

Firstly, considering the influence of domains and copiers, a domain-aware model considering copiers in crowdsourcing, DAC, is proposed. The model represents the credibility of each worker as a vector over domains. For multi-domain tasks, it calculates fine-grained worker credibility and uses a greedy algorithm to assign tasks to domain experts. To avoid assigning tasks to copiers, a Bayesian method is used to detect and remove them. DAC consists of a task assignment phase and a truth discovery phase. In the task assignment phase, worker selection and copier detection are carried out iteratively. In the truth discovery phase, the truth and worker credibility are calculated iteratively, and the results are used to update each worker's domain vector.

Secondly, an improved model, SWWC, which distinguishes task difficulty, is proposed on the basis of DAC. Tasks of different difficulty require different numbers of workers to obtain accurate crowdsourcing results. Before the task assignment phase, SWWC quantifies task difficulty through entropy uncertainty and then determines the number of workers required accordingly, so as to ensure the accuracy of the results while avoiding wasted labor cost. The improved model also proposes a method of initializing a worker's domain vector from a label vector, so that a new worker's domain vector is closer to the worker's real accuracy level.

Experiments are conducted on two real-world data sets and one synthetic data set to compare DAC, SWWC, and seven representative baseline methods in terms of accuracy and efficiency. The experimental results show that DAC and SWWC outperform the other algorithms.

Finally, this thesis designs and develops a crowdsourcing visualization system that applies the SWWC model to task assignment and truth discovery. Its main functions are user management, publishing and viewing tasks for requesters, completing tasks and viewing personal information for workers, and viewing information for administrators.
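
The idea of sizing the worker pool by entropy-based task difficulty, as described for SWWC, can be illustrated with a minimal Python sketch. The abstract does not state the actual formula, so everything here is an assumption made for illustration: that an estimated probability distribution over candidate labels is available before assignment, that difficulty is the Shannon entropy of that distribution normalized to [0, 1], and that the worker count grows linearly with difficulty. The names `normalized_entropy`, `workers_needed`, `min_workers`, and `max_workers` are hypothetical and do not come from the thesis.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a label distribution, scaled to [0, 1].

    `probs` holds non-negative weights over candidate labels; they are
    renormalized here, so they need not sum to exactly 1.
    """
    if len(probs) < 2:
        return 0.0  # a single candidate label carries no uncertainty
    total = sum(probs)
    if total <= 0:
        raise ValueError("label weights must have a positive sum")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # treat 0 * log(0) as 0
            entropy -= q * math.log(q)
    # Dividing by log(k) maps the uniform distribution over k labels to 1.
    return entropy / math.log(len(probs))


def workers_needed(probs, min_workers=3, max_workers=9):
    """Assumed rule: interpolate linearly between a minimum and a maximum
    worker count according to the normalized entropy (task difficulty)."""
    difficulty = normalized_entropy(probs)
    return min_workers + round(difficulty * (max_workers - min_workers))


if __name__ == "__main__":
    easy_task = [0.97, 0.01, 0.01, 0.01]  # one label clearly dominates
    hard_task = [0.25, 0.25, 0.25, 0.25]  # all labels equally plausible
    print("easy task workers:", workers_needed(easy_task))
    print("hard task workers:", workers_needed(hard_task))
```

Under these assumptions, a task whose label distribution is nearly certain receives close to the minimum number of workers, while a task with a uniform distribution receives the maximum. This mirrors the stated goal of keeping results accurate without paying for redundant answers on easy tasks.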
Keywords/Search Tags: crowdsourcing, task assignment, truth discovery, copier, task difficulty