
Resource management for data streaming applications

Posted on: 2011-01-09
Degree: Ph.D
Type: Thesis
University: Georgia Institute of Technology
Candidate: Agrawalla, Bikash Kumar
Full Text: PDF
GTID: 2448390002950089
Subject: Computer Science
Abstract/Summary:
This dissertation investigates novel middleware mechanisms for building streaming applications. Developing streaming applications is a challenging task because (i) they are continuous in nature; (ii) they require efficient transport of data from/to distributed sources and sinks; (iii) they need access to heterogeneous resources spanning sensor networks and high performance computing; and (iv) they are time critical in nature.

One common characteristic of these applications is data fusion. I present a novel programming abstraction, called DFuse, that makes it easier to develop fusion applications. The application program is specified as a dataflow graph with fusion points. The DFuse middleware instantiates the graph on distributed resources and subsumes issues inherent in distributed programming, such as failures, partial fusion, buffer management, and synchronization. Through experiments, I demonstrate that the DFuse API implementation has reasonable overhead.

I also address the challenges involved in allocating high performance computing resources for these applications. The scheduling framework consists of a heuristic algorithm, called Streamline, for placing a streaming application's dataflow graph on HPC resources. I demonstrate the performance benefits of Streamline in a controlled environment through simulation as well as in a wide-area environment using PlanetLab. I also demonstrate that the scheduling algorithm can be implemented as a grid service and deployed in a wide-area environment.

While Streamline places such streaming applications well, application dynamics may cause the computation and communication characteristics of the application to change over time. I present a Distributed Scheduling heuristic and a Periodic Streamline algorithm to address the limitations of the Static Streamline algorithm. The performance of the Distributed Algorithm is compared with Periodic Streamline and Static Streamline. Through micro measurements, I show that the Distributed Algorithm performs close to Periodic Streamline and 6x better than Static Streamline under dynamic resource availability. Through a scalability study, I also show that the Distributed Algorithm performs close (within 5%) to the Periodic Streamline algorithm with much less (7.5x less) overhead.

Finally, using a case study of one such ubiquitous data streaming application and the experience gained from building it, we propose a taxonomy of the ubiquitous computing stack, called UbiqStack. UbiqStack consists of five orthogonal functionalities covering the most commonly occurring subsystems for ubiquitous applications. Through the lens of the UbiqStack taxonomy, we survey a variety of subsystems designed to be the building blocks from which sophisticated infrastructures for ubiquitous computing can be assembled.

In summary, I develop the Fusion Channel programming abstraction, which makes it easier for domain experts to build data streaming applications. An application only needs to specify the input and output connections to fusion channels, and the fusion functions. The subsystems developed in this dissertation take care of instantiating an application, allocating resources for the application (via scheduling heuristics), and dynamically managing the resources (via dynamic scheduling). Through performance evaluation, I demonstrate that the resources are allocated efficiently to optimize the throughput and latency constraints of an application. Through extensive micro measurements and scalability studies, I have established my thesis: "An intuitive programming abstraction will make it easier to build dynamic, distributed, and ubiquitous data streaming applications. Moreover, such an abstraction will enable an efficient allocation of shared and heterogeneous computational resources thereby making it easier for domain experts to build these applications."...
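
To make the idea of a dataflow graph with fusion points more concrete, here is a minimal single-process sketch in Python. It is not the DFuse API: the class `FusionChannel`, its methods `connect` and `put`, and the source names `cam1`, `cam2`, `cam3` are all hypothetical, chosen only to illustrate an application that specifies input connections, output connections, and a fusion function. Distribution, failure handling, partial fusion, and buffer management, which the abstract says the middleware handles, are deliberately left out.

```python
from collections import defaultdict


class FusionChannel:
    """Hypothetical fusion point (illustration only, not the DFuse API).

    It waits until every named input has supplied an item for a given
    timestamp, applies the fusion function, and forwards the result.
    """

    def __init__(self, name, inputs, fuse_fn):
        self.name = name
        self.inputs = list(inputs)      # names of upstream sources
        self.fuse_fn = fuse_fn          # takes a list of values, returns one value
        self.outputs = []               # downstream FusionChannel objects
        self.results = []               # (timestamp, fused_value) pairs produced so far
        self._pending = defaultdict(dict)

    def connect(self, downstream):
        """Declare an output connection to another fusion channel."""
        self.outputs.append(downstream)

    def put(self, source, timestamp, value):
        """Deliver one item from `source`; fuse once all inputs are present."""
        slot = self._pending[timestamp]
        slot[source] = value
        if len(slot) == len(self.inputs):
            fused = self.fuse_fn([slot[s] for s in self.inputs])
            del self._pending[timestamp]
            self.results.append((timestamp, fused))
            for channel in self.outputs:
                channel.put(self.name, timestamp, fused)


# A two-level graph: average two sources, then take the max with a third.
avg = FusionChannel("avg", ["cam1", "cam2"], lambda xs: sum(xs) / len(xs))
peak = FusionChannel("peak", ["avg", "cam3"], max)
avg.connect(peak)

avg.put("cam1", 0, 10.0)
avg.put("cam2", 0, 20.0)    # "avg" now has both inputs and forwards 15.0 to "peak"
peak.put("cam3", 0, 12.0)   # "peak" now has both inputs and fuses them
```

In this sketch the application code only wires channels together and supplies fusion functions; deciding where each channel runs is the placement problem that the scheduling heuristics described in the abstract address.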
Keywords/Search Tags: Applications, Resources, Build, Streamline, Easier, Distributed, Algorithm