Modeling And Management For Data-intensive Services

Posted on: 2018-12-10; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y Z Huang; Full Text: PDF
GTID: 1318330545458218; Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of information technology, the amount of data it generates has increased dramatically. Data-intensive services, which must process large amounts of data to accomplish complex business goals, have emerged accordingly. In recent years, data-intensive services have been applied in several domains and have attracted attention from researchers in both academia and industry. Such services acquire, update, and store large volumes of data during execution. They have three characteristics: (1) large amounts of data, both structured and unstructured, must be processed, analyzed, and stored; (2) data and services are geographically dispersed; (3) the data manipulated by services are not independent, since they interact with each other during service execution. Given these characteristics, services and data need to be studied together. This dissertation studies approaches to modeling and managing data-intensive services from a new perspective. Its contributions are four-fold.

1. Traditional service markup focuses on describing service functions and lacks the ability to describe the data that services manipulate. Another limitation of traditional service markup is that it binds a service to its original physical location. To fill these gaps, this dissertation designs an ontology-based semantic service markup for data-intensive services. Two components, the Service Identifier (SID) and the Service Behavior Description (SBD), are presented to model data-intensive services. A hybrid service naming scheme is designed for the SID so that each service is uniquely identified. Service behavior and attributes are specified by the SBD with fine-grained semantic descriptions. Finally, detailed designs are introduced, and a case study of an SOA-based online multimedia conference system is presented to validate the effectiveness of the approach. The semantic service markup scheme can decouple a service entity from its original physical location and supports service discovery and service mobility.

2. Traditional process-centric business process models face challenges because they lack the ability to describe data semantics and dependencies. This leads to incomplete data-intensive composite services and to inflexible design and implementation of data-intensive business processes. To fill this gap, this dissertation proposes a novel data-aware business process model that can describe both explicit control flow and implicit data flow. A data model is presented whose dependencies are formulated in Linear-time Temporal Logic (LTL), and their satisfiability is validated by an automaton-based model checking algorithm. Data dependencies are fully considered in the modeling phase, which helps improve the efficiency and reliability of programming in the development phase. Finally, a prototype data-aware workflow system based on jBPM is designed using this model and has been deployed in the Beijing Kingfore heating management system to validate the flexibility, efficacy, and convenience of the approach. The results show that the approach can improve the degree of automation in code generation and the adaptivity of process execution, which is helpful for massive coding and large-scale system management in practice.

3. With the evolution of cloud computing, some organizations have deployed workflows into cloud environments. To manage large volumes of data efficiently, the data in business processes should be treated differently according to their importance and the activities they are involved in. Hence, how to distinguish the data entities among the large volume of process data is an urgent problem. To address this challenge, this dissertation presents a frequent-pattern-based approach to discovering interacting data entities for cloud workflows. A direct discriminative mining algorithm is proposed to determine the minimum support threshold, based on which an FP-tree is constructed to formulate the frequent item pairs. An FP-matrix technique is applied to avoid traversing the FP-trees during data entity discovery, and a pruning approach is designed to reduce the redundancy of frequent item pairs. Furthermore, the data entity mining algorithm is parallelized with the MapReduce framework, and a primitive data placement and backup strategy is designed to investigate the benefits of this work. Finally, the efficacy of the approach is validated by experiments on real-life data. The approach is expected to discover interacting data entities efficiently for cloud workflows and to provide fundamental support for data placement and data backup.

4. With the development of data-intensive services, their number has been increasing rapidly. Hence, the Quality of Service (QoS) of data-intensive services should be managed efficiently. The traditional approach manages and rates all services on the user side and ranks them according to their numerical QoS values. Due to the characteristics of data-intensive services, it is difficult to evaluate all services at a single client, as doing so is time-consuming and resource-consuming. To address this challenge, this dissertation proposes a time-aware QoS ranking prediction approach named TSRPred, which obtains a global ranking from a collection of partial rankings. Specifically, a pairwise comparison model is constructed to describe the relationships between different services, where the partial rankings are obtained by time series forecasting on QoS values. The comparisons of data-intensive services are formulated as random walks, so the global ranking can be obtained by sorting the steady-state probabilities of the underlying Markov chain. Finally, the efficacy of TSRPred is validated by simulation experiments on large-scale real-world datasets. The experimental results show that the approach can help users manage large numbers of data-intensive services and supports building high-quality data-intensive systems.
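
The idea of turning partial rankings into a global ranking through the steady state of a Markov chain can be illustrated with a minimal Python sketch. It is not the TSRPred algorithm itself: the abstract does not give the transition rule, so the sketch assumes a simple one in which a random walker at service i moves to service j with probability proportional to the (smoothed) fraction of comparisons that j won against i. The function name `global_ranking`, the smoothing parameter `alpha`, and the example service names are all hypothetical, and the partial rankings are given directly rather than produced by time series forecasting.

```python
from itertools import combinations


def global_ranking(partial_rankings, alpha=1.0, max_iter=10000, tol=1e-12):
    """Aggregate partial rankings (best first) into one global ranking.

    Assumed transition rule (not taken from the dissertation): from service i
    the walker moves to j with probability proportional to the smoothed
    fraction of comparisons in which j was ranked ahead of i.
    Returns a list of (service, steady_state_probability), best first.
    """
    services = sorted({s for ranking in partial_rankings for s in ranking})
    index = {s: k for k, s in enumerate(services)}
    n = len(services)
    if n == 1:
        return [(services[0], 1.0)]

    # wins[i][j] = number of partial rankings that place i ahead of j
    wins = [[0.0] * n for _ in range(n)]
    for ranking in partial_rankings:
        for better, worse in combinations(ranking, 2):
            wins[index[better]][index[worse]] += 1.0

    # Build a row-stochastic transition matrix with self-loops.
    # The smoothing term alpha keeps every off-diagonal entry positive,
    # so the chain is irreducible and aperiodic.
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                total = wins[i][j] + wins[j][i] + 2.0 * alpha
                P[i][j] = (wins[j][i] + alpha) / total / (n - 1)
        P[i][i] = 1.0 - sum(P[i])

    # Power iteration: repeatedly apply pi <- pi * P until it stops changing.
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        done = max(abs(a - b) for a, b in zip(nxt, pi)) < tol
        pi = nxt
        if done:
            break

    # Services that win more comparisons accumulate more probability mass.
    return sorted(zip(services, pi), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical partial rankings, each listing services from best to worst.
    observed = [["s1", "s2", "s3"], ["s1", "s3"], ["s2", "s3"], ["s1", "s2"]]
    for service, score in global_ranking(observed):
        print(service, round(score, 4))
```

The smoothing term is a design choice made for the sketch: it guarantees a unique steady state even when some pairs of services were never compared, at the cost of pulling sparsely compared services toward the middle of the ranking.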
Keywords/Search Tags:Data-Intensive Services, Service Modeling, Business Process, Data, Quality of Service