The Design And Implementation Of Data Collection And Audit System Based On Crowdsourcing Mode

Posted on: 2021-02-09, Degree: Master, Type: Thesis
Country: China, Candidate: Y H Jiao
GTID: 2518306047984749, Subject: Master of Engineering
Abstract/Summary:
In recent years, the development of the Internet industry has created a huge demand for data, which has driven the rapid, large-scale growth of the data industry. Data collection is the main business of a data provider: the data demander places a data order with specific requirements, and the provider then collects and organizes structured data that meets those requirements. However, because the industry has a short history, low investment, and uneven development, it faces several problems: data quality is hard to guarantee, costs remain high, collection cycles are too long to meet market needs, and data security is not assured.

To address these problems, this thesis introduces a data collection and audit system based on the crowdsourcing mode. Firstly, the system distributes the data provider's traditional data collection and quality assurance work to network users in the form of crowdsourcing platform tasks, which keeps costs under control. Secondly, the system assures the quality of the collected data while reducing costs by using statistical methods such as sample question filtering, answer fitting, and answer sampling. Finally, the system implements a PC client. On the one hand, the client makes full use of the machine resources of different customers to process the collected data in parallel and cooperates with the crowdsourcing platform to distribute work, thereby shortening the collection cycle and improving business throughput. On the other hand, the PC client uses digital watermarking technology to embed important business information imperceptibly in the original data and implements a responsibility traceability function, which meets the information security requirements that the relevant laws place on the data collection industry.

In terms of technology selection, the system uses a mainstream architecture. The front end uses the mature and stable open source Angular2 framework. The server is based on the LAMP architecture and uses the PHP-based Yii framework, which supports MVC and OOP, to improve development efficiency and keep the focus on business logic. To improve system performance and reduce database read and write pressure, the system uses Redis as a cache to avoid unnecessary SQL operations and uses BOS, the Baidu cloud storage service, to store the massive data.

The prototype of the crowdsourcing data collection and audit system introduced in this thesis is a subsystem of the Baidu crowdsourcing platform and implements the core functions of the data collection business. The Baidu crowdsourcing platform has been running smoothly for many years and has gradually become a benchmark in the data collection and data annotation industry. It has many successful cases in the data industry and has developed a set of solutions that are widely recognized by the industry in terms of data delivery, collection cycle, data quality, and cost control.
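The abstract names sample question filtering, answer fitting, and answer sampling but does not describe the algorithms behind them. The following is a minimal Python sketch of one common way such steps are done, not the thesis's implementation: workers are screened by their accuracy on known-answer questions, the remaining answers are combined by majority vote, and a random subset of results is drawn for manual audit. The function names, the `(worker, question, answer)` tuple layout, the 0.8 accuracy threshold, and the 10% audit rate are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def filter_workers(gold_answers, submissions, min_accuracy=0.8):
    """Return workers whose accuracy on known-answer (gold) questions
    is at least min_accuracy. The threshold is a hypothetical default."""
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker, question, answer in submissions:
        if question in gold_answers:
            seen[worker] += 1
            if answer == gold_answers[question]:
                correct[worker] += 1
    return {w for w in seen if correct[w] / seen[w] >= min_accuracy}


def aggregate_answers(gold_answers, submissions, trusted_workers):
    """Majority vote over trusted workers for each non-gold question.
    Ties go to the answer that was submitted first."""
    votes = defaultdict(Counter)
    for worker, question, answer in submissions:
        if worker in trusted_workers and question not in gold_answers:
            votes[question][answer] += 1
    return {q: counts.most_common(1)[0][0] for q, counts in votes.items()}


def sample_for_audit(results, rate=0.1, seed=None):
    """Pick a random subset of question ids for manual audit
    (at least one whenever there are any results)."""
    questions = sorted(results)
    k = min(len(questions), max(1, round(len(questions) * rate)))
    return random.Random(seed).sample(questions, k)


# Hypothetical data: "g1" is a gold question with a known answer.
gold = {"g1": "cat"}
submissions = [
    ("w1", "g1", "cat"), ("w1", "q1", "dog"),
    ("w2", "g1", "cat"), ("w2", "q1", "dog"),
    ("w3", "g1", "dog"), ("w3", "q1", "cat"),
]
trusted = filter_workers(gold, submissions)
results = aggregate_answers(gold, submissions, trusted)
to_audit = sample_for_audit(results, rate=0.1, seed=0)
```

In this example, worker `w3` fails the gold question and is excluded, so only the answers from `w1` and `w2` contribute to the vote on `q1`. A production system would likely weight workers by reliability rather than use a hard cutoff, but the abstract gives no detail on that point.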
Keywords/Search Tags:Crowdsourcing Mode, Data Collection, Yii Framework, Data Audit