
Design And Implementation Of Real Time Computing Platform For E-commerce Transaction Data

Posted on: 2021-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Chen
Full Text: PDF
GTID: 2428330611965911
Subject: Computer technology
Abstract/Summary:
Real-time computing, also known as stream processing, is the practice of collecting and processing data as it is generated, meeting the need for faster computation and analysis. The emergence and development of big data technology has provided new solutions for processing massive volumes of data. Offline batch processing systems originally built on Hadoop use HDFS for storage and MapReduce and Hive as computing engines; this pattern is still commonly used to build large data warehouses. In the traditional warehouse-based approach to query statistics, data from each business system is synchronized into the data warehouse to form the ODS (operational data store) layer, and statistics are developed on that basis. Such warehouses usually store T-1 data, which cannot satisfy real-time processing needs. Although the scheduling interval can be adjusted to between 30 minutes and several hours, this approach, characterized by high throughput and long processing times, only meets the requirements of offline analysis reports with low timeliness demands. More and more application scenarios can no longer be served by offline batch processing alone: because the value of data is time-sensitive, enterprises expect to process streaming data online and respond to it instantly. The emergence and development of real-time computing technology has accelerated data processing and made data more valuable.

Based on Blue Moon's requirements for real-time processing of e-commerce transaction data, tracking (buried-point) data, and manufacturing data, this thesis discusses the design and implementation of a unified big data real-time computing platform. The data processing flow of the platform is as follows: first, Canal parses the MySQL binlog and writes it to local files; second, Flume collects the data and sends it to Kafka, a distributed message queue; finally, Storm, a distributed real-time computing framework, pulls the data from Kafka for processing and writes the results to Redis, a high-performance cache component. In addition, the platform incorporates Elasticsearch, a distributed index service, to meet real-time indexing needs; Elasticsearch also stores and supports analysis of the dirty data encountered during real-time computation.
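The staged flow described above (binlog event → message queue → stream processing → cache) can be sketched as a minimal in-process simulation. This is an illustrative sketch only, using the Python standard library rather than the real Canal, Flume, Kafka, Storm, or Redis APIs: a `queue.Queue` stands in for the Kafka topic, a plain `dict` stands in for the Redis cache, and the event fields `shop_id` and `amount` are hypothetical examples of e-commerce transaction data.

```python
import json
import queue

# Stand-in for the Kafka topic: a simple in-process queue of JSON messages.
kafka_topic = queue.Queue()

def publish_binlog_event(event: dict) -> None:
    """Simulates Canal parsing a binlog row change and Flume forwarding it
    to Kafka: the change event is serialized and placed on the 'topic'."""
    kafka_topic.put(json.dumps(event))

def process_events(cache: dict) -> None:
    """Simulates a Storm bolt: pull events off the topic, aggregate
    per-shop transaction totals, and write them to a Redis-like cache."""
    while not kafka_topic.empty():
        event = json.loads(kafka_topic.get())
        key = f"shop:{event['shop_id']}:sales"
        cache[key] = cache.get(key, 0) + event["amount"]

cache = {}
publish_binlog_event({"shop_id": 1, "amount": 100})
publish_binlog_event({"shop_id": 1, "amount": 50})
publish_binlog_event({"shop_id": 2, "amount": 30})
process_events(cache)
print(cache)  # {'shop:1:sales': 150, 'shop:2:sales': 30}
```

In the real platform each stage is a separate distributed service, so the queue provides durability and replay and the cache survives consumer restarts; the in-process version only illustrates the direction of data flow.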
Keywords/Search Tags:Real-time computing, Storm, Binlog, Canal, Kafka