Research On Data Supply Chain Model And Quality Of Service Guarantee

Posted on: 2019-07-09 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: P Li
GTID: 1318330542495356 | Subject: Computer Science and Technology
Abstract/Summary:
With the advent of the big data era, a large number of open data platforms have arisen. These platforms enable data to flow freely, so that a data supply chain is formed among different users. Data supply chains can provide various data services for users. They also help break down data silos and deepen cooperation between enterprises, so that data generates greater value. However, the process of data flow within a platform or between platforms is complicated, so the data flow must be made easy to understand before data supply chains can be exploited. In addition, during selection users face a massive set of candidate data supply chains with equivalent or similar functions but different QoS. Since the performance of most candidate data supply chains is unknown to a user, it is difficult for users to select the most suitable service. Moreover, node data in a data supply chain is multi-source, dynamic, and massive. When an application of a data supply chain is executed, these characteristics may lead to low execution efficiency or node failure. To solve the above issues, we propose a series of novel models and methods.

For the problem of constructing a data supply chain model in a multi-platform environment, an information model for data supply chains is designed, which references and extends PROV, the data provenance specification published by the W3C. Based on that, we present a node generation algorithm for data supply chains, which allows each data platform to record data movement information in accordance with a unified specification. Furthermore, for the efficiency optimization problem of constructing complex data supply chain models, a hierarchical management strategy for data supply chains based on summarization technology is proposed. By merging intermediate versions, the node records are divided into multiple layers and stored optimally. We also present a multi-level data supply chain query algorithm. To achieve hierarchical construction of data supply chains, a layered query mechanism retrieves query results from the related data platforms according to the user's request, and these results are then merged and sorted.

For the QoS prediction problem of data supply chains, a QoS prediction model for data supply chains is proposed. First, a multi-dimensional QoS model is proposed whose properties capture response time, execution cost, accuracy, and data freshness. Based on that, we focus on basic composite structures and formulate the QoS aggregation functions for data supply chains. Furthermore, the query-driven and period-driven execution paradigms of data supply chains are studied. Then, we present a context-based QoS prediction method for data supply chains, which exploits users' contexts to obtain the subchains that need to be executed when a user query occurs and leverages them to predict QoS values with high precision.

For the problem of optimizing the combination of node operating modes in data supply chains, a node operating mode optimization strategy is proposed. First, we take into account that the QoS of a data supply chain is highly related to the operating modes of its nodes. Then, we put forward an objective function that balances data freshness, hit rate, and cost from the perspectives of users and service providers. To solve the objective function, we present a novel node operating mode combination optimization algorithm based on an enhanced ant colony system. It selects an appropriate candidate operating mode for each node to generate a composite operating mode with optimal QoS values under the QoS constraints.

For the problem of substituting failed nodes in a data supply chain, a method for determining data supply chain comparability is proposed. First, the semantic similarity between the names and text descriptions of nodes is used to find the key points. Based on these key points, the original data supply chains are divided into subchains. We then propose a feature space representation model, which extracts the key features from each subchain and simplifies it into a feature vector. Next, we formulate the similarity computation of subchains based on multiscale features. Further, a hierarchical clustering algorithm is proposed to separate the subchains into disjoint groups; the cluster containing the query object is then the similarity search result. To select optimal subchains from these similarity results, a subchain recommendation algorithm based on optimization selection is presented. For each subchain, we measure the overall QoS of the data supply chain if the target node were replaced by this subchain. Based on the computed QoS results, the optimal subchains are recommended to the user.

Finally, to further verify the effectiveness of the proposed methods and models in real environments, we design and implement a prototype data management platform to test the key technologies. The desired results are achieved, and the performance of the proposed methods is analyzed.
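
The QoS aggregation over basic composite structures mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the aggregation functions themselves, so the rules below are assumptions borrowed from conventional service composition practice: sums for response time and cost in a sequence, the maximum response time for parallel branches, a product for accuracy, and the minimum for data freshness. The names `QoS`, `sequence`, `parallel`, and `aggregate` are hypothetical and not taken from the dissertation.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable


@dataclass(frozen=True)
class QoS:
    """Hypothetical four-dimensional QoS record for one node or subchain."""
    response_time: float  # lower is better
    cost: float           # lower is better
    accuracy: float       # in [0, 1], higher is better
    freshness: float      # in [0, 1], higher is better


def sequence(a: QoS, b: QoS) -> QoS:
    """Assumed rule for two nodes executed one after the other."""
    return QoS(
        response_time=a.response_time + b.response_time,  # delays add up
        cost=a.cost + b.cost,                             # costs add up
        accuracy=a.accuracy * b.accuracy,                 # errors compound
        freshness=min(a.freshness, b.freshness),          # stalest input dominates
    )


def parallel(a: QoS, b: QoS) -> QoS:
    """Assumed rule for two branches executed concurrently and then joined."""
    return QoS(
        response_time=max(a.response_time, b.response_time),  # wait for the slowest branch
        cost=a.cost + b.cost,
        accuracy=a.accuracy * b.accuracy,
        freshness=min(a.freshness, b.freshness),
    )


def aggregate(combine: Callable[[QoS, QoS], QoS], nodes: Iterable[QoS]) -> QoS:
    """Fold a non-empty collection of QoS records with one structure rule."""
    return reduce(combine, nodes)


# Illustrative values only; they are not measurements from the dissertation.
n1 = QoS(response_time=2.0, cost=1.0, accuracy=0.5, freshness=0.9)
n2 = QoS(response_time=3.0, cost=4.0, accuracy=0.5, freshness=0.8)
seq_qos = aggregate(sequence, [n1, n2])
par_qos = aggregate(parallel, [n1, n2])
```

Nested structures can be handled by aggregating inner subchains first and passing the resulting `QoS` records to the outer rule, which is also how the overall QoS after a node substitution could be recomputed under these assumed rules.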
Keywords/Search Tags:data supply chain, model construction, QoS prediction, QoS optimization, node substitution