
A Method And Optimization Strategy For Publishing Questions In Schema Matching Via Crowdsourcing

Posted on: 2017-01-08  Degree: Master  Type: Thesis
Country: China  Candidate: K Xu  Full Text: PDF
GTID: 2308330509456417  Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of information technology and networks has made cooperation across application fields more frequent and data interoperability more important. Because data producers in each field are highly autonomous, their data models are heterogeneous. To share and use the data, data integration is needed to break down the resulting "information islands". As a fundamental problem in data integration research, schema matching has attracted the attention of academia, which has proposed many matching methods and developed a variety of schema matching tools. A schema matching tool can greatly improve the efficiency of schema matching, but its results carry uncertainty, and this uncertainty is difficult to eliminate by optimizing the matching approach alone. Human knowledge and life experience can help reduce it.

Crowdsourcing, a new problem-solving process, has been widely applied in many areas. In the database field, using crowdsourcing to resolve the uncertainty that schema matching tools produce during matching has become a research hotspot. Building on the use of crowdsourcing for schema matching, this thesis studies the process of publishing questions to the crowd and proposes a publishing method and optimization strategies. These methods and strategies can save schema matching cost and time.

The thesis has two parts. The first part, based on entropy and the Beta distribution, presents a method called Entropy-Beta for publishing schema matching questions via crowdsourcing.
The method uses entropy to measure the uncertainty of schema matching. It chooses the best question to publish according to the entropy of the schema matching result set with respect to each question, so that every published question reduces the uncertainty of the tool's output as much as possible and improves efficiency. Meanwhile, the Beta distribution is used to estimate the accuracy of answers, and the order of published questions is adjusted according to the estimates to ensure that the problem is solved accurately.

The second part, building on the Entropy-Beta method, introduces the marginal principle from economics and proposes MarP, an optimization strategy for publishing schema matching questions via crowdsourcing. The strategy considers the uncertainty in schema matching accuracy and the cost of publishing questions. According to the marginal principle, the order of published questions is optimized and adjusted to satisfy the principle of diminishing marginal returns. The cost incurred and the way uncertainty is reduced during the publishing process are then given. Depending on the stopping conditions, publishing stops so as to achieve the best certainty at the lowest cost.

Finally, we use four sets of experimental data to evaluate the efficiency, certainty and cost of solving schema matching problems, through simulation experiments and recruited volunteers. Comparison of the experimental results verifies that the publishing method and optimization strategies proposed in this thesis can satisfy the publisher.

The thesis makes two innovations:
1. For the ordering of questions in crowdsourced schema matching, we present an entropy-based method for ranking the questions to publish. The method selects for publication the question that contributes most to schema matching, improving the efficiency of solving the problem.
2. For the process of publishing questions, we propose a strategy based on the marginal principle.
According to the conditions for stopping publication, the strategy ensures that the publisher obtains the maximum benefit at the minimum cost.
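
The abstract gives no formulas, so the following Python sketch is only one plausible reading of the ideas it names and not the thesis's actual algorithm. It assumes that a matching tool outputs a probability for each candidate matching, that a question asks whether a single attribute correspondence holds, and that answers are noiseless when computing entropy gain. The function names (`expected_gain`, `beta_accuracy`, `plan_questions`), the parameters `cost_per_question` and `value_per_bit`, and the toy data are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_gain(matchings, question):
    """Expected entropy reduction from a noiseless yes/no answer.

    `matchings` maps each candidate matching (a frozenset of attribute
    correspondences) to its probability. Assumption: the answer tells us
    exactly whether `question` belongs to the true matching.
    """
    yes = [p for m, p in matchings.items() if question in m]
    no = [p for m, p in matchings.items() if question not in m]
    p_yes, p_no = sum(yes), sum(no)
    after = 0.0
    if p_yes > 0:
        after += p_yes * entropy([p / p_yes for p in yes])
    if p_no > 0:
        after += p_no * entropy([p / p_no for p in no])
    return entropy(matchings.values()) - after


def beta_accuracy(correct, wrong, alpha=1.0, beta=1.0):
    """Posterior mean of answer accuracy under a Beta(alpha, beta) prior."""
    return (alpha + correct) / (alpha + beta + correct + wrong)


def plan_questions(matchings, questions, cost_per_question, value_per_bit):
    """Rank questions by gain and stop where marginal benefit < marginal cost.

    Simplification: gains are computed once against the initial
    distribution and are not updated after each answer arrives.
    """
    ranked = sorted(questions,
                    key=lambda q: expected_gain(matchings, q),
                    reverse=True)
    plan = []
    for q in ranked:
        if expected_gain(matchings, q) * value_per_bit < cost_per_question:
            break  # every later question has an even smaller gain
        plan.append(q)
    return plan


# Toy data (hypothetical): three candidate matchings of two attributes.
matchings = {
    frozenset({("name", "title"), ("addr", "address")}): 0.5,
    frozenset({("name", "title"), ("addr", "city")}): 0.3,
    frozenset({("name", "label"), ("addr", "address")}): 0.2,
}
questions = [("name", "title"), ("addr", "address"),
             ("addr", "city"), ("name", "label")]

plan = plan_questions(matchings, questions,
                      cost_per_question=0.8, value_per_bit=1.0)
```

With these toy numbers, the two questions about `addr` split the probability mass 0.7 / 0.3 and therefore carry more expected information than the two about `name` (0.8 / 0.2), so only the former clear the cost threshold. A fuller version would recompute the distribution after each answer and weight answers by the estimated accuracy from `beta_accuracy`.
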
Keywords/Search Tags: crowdsourcing, schema matching, Entropy, Beta, marginal revenue, marginal point