
Improving Crowdsourcing Data In Human Computation Powered Systems And Its Related Applications

Posted on: 2018-06-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H M Rao
Full Text: PDF
GTID: 1318330542490545
Subject: Computer application technology
Abstract/Summary:
Big data and artificial intelligence now touch almost every aspect of daily life, driven largely by advances in deep learning and by powerful GPU computation. On the one hand, although applications such as image classification, speech recognition, and self-driving cars have made significant progress as a result, many tasks remain difficult to automate yet relatively easy for humans, and for others there are still not enough data to train a qualified deep network. On the other hand, crowdsourcing has recently emerged as a powerful alternative, serving either as a means of collecting large datasets for model training or as human-powered computation for complex semantic tasks. One of the biggest challenges in crowdsourcing, however, is that crowd workers are unreliable, which leads to poor-quality data or incorrect processing output. This is why quality control is one of the most active topics in crowdsourcing research.

Existing approaches to quality control in crowdsourcing can be broadly classified into two categories: design-time and run-time. At design time, tasks are defined and designed, based on human-factors theories and other rules, so that a suitable crowd can contribute high-quality data. At run time, while the task is running and crowd contributions are being collected, different techniques are applied to analyze, filter, and aggregate the data into the final task answer. Situated in the field of HCI (Human-Computer Interaction), this thesis takes the former approach: it designs user experiments and studies how different reward schemes and design strategies affect the distribution of crowd data in the task of schematizing pictures. This task was chosen to illustrate the experiments for two reasons. First, it is an active topic in the field of scene understanding. Second, it has an objective answer and requires the identification and integration of spatial cues, which sets it apart from other crowdsourcing tasks.

The contributions of this thesis are summarized below:

(1) We introduced a prototype system to show how human computation can be used to generate schematic maps from a set of pictures, without making strong assumptions or demanding extra devices. The system required humans (crowd workers from Amazon Mechanical Turk) to do simple spatial mapping tasks under various conditions. Their data were aggregated by filtering and clustering techniques that allow salient cues in the pictures to be identified and their spatial relations to be inferred and projected onto a 2D map. The system consists of two main components: one identifies objects from the rectangles drawn in the image, and the other extracts object positions from the labels mapped onto the 2D grid. The output of the system is the scene structure represented in the input picture, indicating not only what kinds of objects exist in the scene but also their 2D spatial relationships.

(2) We ran a study on Amazon Mechanical Turk to investigate how crowd workers perform a spatial location identification task under two kinds of reward schemes, ground truth and majority vote, even when they are not familiar with the environment. In this task, crowd workers were presented with a camera view of a location and were asked to identify the location on a two-dimensional map. In the "ground truth" scheme, workers were rewarded if their answers were close enough to the correct locations. In the "majority vote" scheme, workers were told that they would be rewarded if their answers were similar to those of the majority of other workers. Results showed that the majority vote reward scheme led to consistently more accurate answers. Cluster analysis further showed that the majority vote reward scheme led to answers with higher reliability (a higher percentage of answers in the correct clusters) and precision (a smaller average distance to the cluster centers). This study shows that reward schemes can make a remarkable difference in crowdsourcing task design and should therefore be carefully crafted. Compared to the ground truth scheme, the majority vote scheme can lead to better results in general cases.

(3) We identified the main challenges in designing the above system and proposed solutions to them. In particular, we tested and demonstrated the effectiveness of two methods that improved the quality of the generated schematic map: 1) we encouraged humans to adopt an allocentric representation of salient objects by guiding them to perform mental rotations of these objects; and 2) we sensitized human perception with guiding arrows superimposed on the imagery to improve the accuracy of depth and width estimation. We demonstrated the feasibility of our system by evaluating schematic maps generated from indoor pictures taken in an office building. By calculating Riemannian shape distances between the generated maps and the ground truth, we found that the generated schematic maps captured the spatial relations well. The results not only show that combining human computation with machine clustering can lead to more accurate schematized maps from imagery, but also suggest that certain levels of desirable difficulty can make human computation systems more robust and accurate, as the extra difficulty implicitly guides human cognition to work harder in ways that generate better data.

(4) Based on the above findings, we proposed a general framework for incorporating human computation into collaborative mobile indoor navigation assistance systems. Since crowd workers have been shown to be capable of efficiently handling spatial semantic information under dedicated conditions, the framework offers a cost-effective way to convey what the user is seeing to a remote crowd worker, so that interactive assistance, such as augmented reality techniques, can be provided to the local user. One unique and critical challenge for such a system is supporting the communication of spatial information, as people at remote locations cannot anchor their conversation by directly referencing objects in the same spatial environment. Following the general framework, we implemented a prototype system that combines schematic representations and augmented reality tools to help users establish anchors that allow them to both see and refer to the same locations, so that users can develop a richer representation of the spatial environment and navigate an unfamiliar indoor environment. We focused on the task of indoor navigation assistance to highlight this challenge, and the prototype allowed two or more persons at remote locations to better communicate and make spatial inferences.
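
As an illustration of the reliability and precision measures mentioned in contribution (2), the following is a minimal sketch, not the thesis's actual analysis code. It assumes the answers have already been clustered by some algorithm, and it assumes that the "correct" cluster is the one whose center lies closest to the ground-truth location. The function name `cluster_metrics` and the sample data are hypothetical.

```python
from collections import defaultdict
from math import dist


def cluster_metrics(answers, labels, ground_truth):
    """Return (reliability, precision) for clustered 2D answers.

    answers: list of (x, y) worker answers on the map
    labels: cluster label for each answer (from any clustering method)
    ground_truth: (x, y) of the true location
    """
    # Group the answers by their cluster label.
    groups = defaultdict(list)
    for point, label in zip(answers, labels):
        groups[label].append(point)

    # Each cluster center is the mean of its member points.
    centers = {
        label: (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
        )
        for label, pts in groups.items()
    }

    # Assumption: the correct cluster is the one nearest the ground truth.
    correct = min(centers, key=lambda lab: dist(centers[lab], ground_truth))

    # Reliability: share of all answers that fall in the correct cluster.
    reliability = len(groups[correct]) / len(answers)

    # Precision: mean distance from each answer to its own cluster center
    # (smaller means tighter clusters).
    precision = sum(
        dist(p, centers[lab]) for p, lab in zip(answers, labels)
    ) / len(answers)
    return reliability, precision


# Hypothetical data: four answers near the true location, one outlier.
answers = [(1, 1), (1, 3), (3, 1), (3, 3), (10, 10)]
labels = [0, 0, 0, 0, 1]
reliability, precision = cluster_metrics(answers, labels, (2, 2))
print(reliability, precision)
```

Under these assumptions, a reward scheme that yields higher reliability and a smaller precision value would correspond to the "more reliable, more precise" answers the study reports for majority vote.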
Keywords/Search Tags: crowdsourcing quality control, human computation, picture schematization, remote collaborative system