The Design And Implementation Of Streaming System In Kylin Framework

Posted on:2018-04-15

Degree:Master

Type:Thesis

Country:China

Candidate:G Wang

Full Text:PDF

GTID:2348330515492957

Subject:Engineering

Abstract/Summary:

With the arrival of the big data era, a variety of big data frameworks continue to develop, and in-memory computing and stream computing frameworks have become increasingly popular. Traditional OLAP engines, which use a batch processing model, can no longer meet users' performance requirements. How to add stream processing or in-memory computing capability to a traditional OLAP engine, and thereby reduce query response latency, has therefore become a hot topic in the big data field. As an open source OLAP engine, Apache Kylin builds on the distributed computing framework Hadoop and adopts a pre-processing approach, which provides sub-second responses for most queries. However, Kylin is essentially a batch framework. The Kylin Streaming system, a sub-module of Apache Kylin, adds stream processing capability to Kylin and effectively reduces the latency of pre-processing. The Kylin Streaming system uses a stream processing model, with Kafka as the streaming data platform. Data is aggregated in memory before cube building, and the aggregation results can already answer user queries, which greatly reduces query response delay. The Streaming Cubing Engine module then builds the data cube from the aggregation results, so that query requests over massive data can also be served. With the Kylin Streaming system, Apache Kylin becomes an OLAP engine that supports both offline and real-time processing, meeting the needs of users in different use cases.
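
The flow described in the abstract (aggregate incoming records in memory, answer queries from those aggregates, then build a cube segment from them) can be illustrated with a minimal Python sketch. This is not Kylin's implementation: the class `InMemoryAggregator`, its methods, and the record fields are hypothetical names chosen for illustration, and a plain list of dictionaries stands in for messages read from Kafka.

```python
from collections import defaultdict


class InMemoryAggregator:
    """Hypothetical illustration of in-memory pre-aggregation (not a Kylin class)."""

    def __init__(self, dimensions, measure):
        self.dimensions = tuple(dimensions)  # columns to group by
        self.measure = measure               # numeric column to sum
        self._sums = defaultdict(float)      # dimension values -> running sum

    def ingest(self, record):
        """Fold one streaming record into the running aggregates."""
        key = tuple(record[d] for d in self.dimensions)
        self._sums[key] += record[self.measure]

    def query(self, **filters):
        """Sum the measure over all groups matching the given dimension filters."""
        total = 0.0
        for key, value in self._sums.items():
            row = dict(zip(self.dimensions, key))
            if all(row[d] == v for d, v in filters.items()):
                total += value
        return total

    def build_segment(self):
        """Snapshot the aggregates as a cube segment and reset the in-memory state."""
        segment = dict(self._sums)
        self._sums.clear()
        return segment


# Assumed sample records; in a real deployment these would arrive from a Kafka topic.
records = [
    {"country": "CN", "device": "mobile", "amount": 10.0},
    {"country": "CN", "device": "desktop", "amount": 5.0},
    {"country": "US", "device": "mobile", "amount": 7.0},
]

aggregator = InMemoryAggregator(["country", "device"], "amount")
for record in records:
    aggregator.ingest(record)

# Queries are answered from memory before any segment has been built.
realtime_total = aggregator.query(country="CN")

# Later, the aggregates are handed off as the input to cube building.
segment = aggregator.build_segment()
```

In this sketch a query never waits for a batch build, which is the latency benefit the abstract attributes to in-memory aggregation; the snapshot returned by `build_segment` only stands in for the input that a cubing engine would consume.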

Keywords/Search Tags:

streaming processing, Kafka, data cube, dimension, pre-process, Hadoop

Related items

1	A Distributed Cache And Analysis Platform For Large Scale Streaming Data Based On Kafka
2	OLAP Algorithm Research Based On Dimension Hierarchy For Data Cube
3	Design And Implementation Of Kafka-based Full-Link Stream Data Processing Platform
4	Data Cube Implementation Of Dimension Frequent Itemset
5	The Research Of Log Processing Platform Based On Apache Kafka
6	Design And Implementation Of Parallel Processing Platform Of Video Base On Spark Streaming
7	Big Data Flow Processing Analysis System Based On Kafka
8	Research On The Storage Technique Of Data Cube Based-on Dimension Hierarchy
9	Design And Implementation Of Closed Histogram Cube Based On Hadoop
10	Research And Implementation Of Test Data Processing System Based On Spark Streaming