Research on a Big Data System Supporting Multiple Computing Models

Posted on: 2016-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2308330473954432
Subject: Computer application technology
Abstract/Summary:
With the advent of the big data era, big data technology is developing rapidly. A typical change is the evolution of the computing model from the initial batch mode to stream computing and real-time interactive modes. Each computing framework has its own area of applicability. Batch computing can easily handle large-scale data, but it has a long response time. Stream computing is a continuous computing mode and can give the user a quick response to an event. As application scenarios grow more complex, a single computing framework cannot satisfy the requirements. There have been some studies on hybrid computation models. Our research focuses on building a system that supports multiple computing frameworks. Integrating different frameworks raises a variety of problems, including non-uniform and diverse upper-layer interfaces and the unified scheduling of cluster resources across heterogeneous systems. To address these problems, we focus on a unified abstraction language layer, support for multiple computing frameworks, compiler optimization, and a cost-based model. We implement a prototype system. At the user layer, we use SQL as the query language to improve the ease of use of the system. The system currently supports both batch and stream processing. To enhance real-time query performance, we use HBase at the storage layer. We propose a cost evaluation model for MapReduce and Storm that is used to select the computing framework automatically. According to performance tests, overall performance is better than Hive and close to SummingBird. Because the framework is selected intelligently, the system can strike a balance between batch and stream computing. For large-scale data, compared with SummingBird on Storm, throughput is increased by 16% to 20%. With the streaming computing framework, the speed is 20% to 40% faster than Hive.
Keywords/Search Tags: Big Data System, Computing Framework, Hybrid System, Hadoop, Storm