Design And Implementation Of Real-time Processing Architecture For Big Data Based On Storm

Posted on:2019-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:Z M Zhao

Full Text:PDF

GTID:2428330566497313

Subject:Software engineering

Abstract/Summary:

In today's society, massive data sets are mined and used ever more frequently. In practical scenarios, real-time data often has to be processed and analyzed, with the results fed back promptly. Initially, to respond quickly to business requirements, many enterprises used the publish/subscribe feature of Redis, combined with its List, Sorted Set and Hash data structures, to process the data, and finally returned the results over a socket. This approach relies heavily on shared memory, and as the amount of data grows, depending on machine memory is clearly no longer appropriate. Therefore, to meet the application requirements of high concurrency, large data volumes and strict real-time response, this thesis designs and implements a real-time big data processing architecture suited to the current situation. The work is driven by the functional requirements of three subsystems (advertising analysis, promotion analysis and coupon analysis) under the marketing analysis theme in an actual business scenario. Following the data flow, the architecture is divided into five layers: the message middleware layer (data acquisition), the infrastructure layer (real-time processing), the data storage layer, the service layer and the application layer. Technology selection is carried out around this five-layer structure, resulting in a processing architecture that is loosely coupled, highly scalable and reusable. First, in the data acquisition phase, a message queue built on Kafka serves as a buffer, avoiding the data lag and loss caused by explosive data growth. Second, a stream processing framework is built on Storm, forming a distributed data processing network that resolves the complexity of controlling traditional message queues. Third, MySQL, HBase and Elasticsearch are selected to provide combined storage across multiple data sources, taking data characteristics and economic cost into account. Finally, to optimize query efficiency, distributed SQL queries are implemented with Presto. After nearly a year of analysis, design, development, debugging, testing and repeated verification, the architecture has, since October of last year, gradually been put into use in the production environment with good results, demonstrating its availability, stability and high performance.
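
The data flow through the first two layers (buffered acquisition, then stream processing) can be illustrated with a minimal in-memory sketch. It is not the thesis implementation and uses neither Kafka nor Storm: the names `BufferQueue`, `spout`, `parse_bolt`, `count_bolt` and `run_pipeline`, as well as the event fields `subsystem` and `action`, are hypothetical and chosen only for illustration.

```python
from collections import Counter, deque


class BufferQueue:
    """In-memory stand-in for the message queue buffer (assumption: not Kafka)."""

    def __init__(self):
        self._events = deque()

    def publish(self, event):
        # Producers append events; the queue absorbs bursts of input.
        self._events.append(event)

    def poll(self):
        # Consumers take the oldest event, or None when the buffer is empty.
        return self._events.popleft() if self._events else None


def spout(queue):
    """Source stage: emit events from the buffer until it is drained."""
    while (event := queue.poll()) is not None:
        yield event


def parse_bolt(events):
    """Processing stage: drop malformed events, keep (subsystem, action) pairs."""
    for event in events:
        if "subsystem" in event and "action" in event:
            yield (event["subsystem"], event["action"])


def count_bolt(pairs):
    """Aggregation stage: count occurrences of each (subsystem, action) pair."""
    counts = Counter()
    for pair in pairs:
        counts[pair] += 1
    return counts


def run_pipeline(raw_events):
    """Wire the stages together: buffer -> spout -> parse -> count."""
    queue = BufferQueue()
    for event in raw_events:
        queue.publish(event)
    return count_bolt(parse_bolt(spout(queue)))


if __name__ == "__main__":
    sample = [
        {"subsystem": "advertising", "action": "click"},
        {"subsystem": "advertising", "action": "click"},
        {"subsystem": "coupon", "action": "redeem"},
        {"malformed": True},  # skipped by parse_bolt
    ]
    for key, value in sorted(run_pipeline(sample).items()):
        print(key, value)
```

In a real deployment the buffer and the processing stages would run as separate distributed services, and the aggregated results would be written to the storage layer rather than returned in memory.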

Keywords/Search Tags:

Big Data, Real-time processing, Storm, KAFKA, ElasticSearch

Related items

1	Design And Implementation Of Real-time Log Stream Processing System Based On Kafka And Storm
2	Order Big Data Real-time Monitoring System Based On Storm
3	Design And Implementation Of Log Big Data Service Platform Based On ElasticSearch And Storm
4	The Design And Implementation Of Real-time Processing System For Device Log Stream Data Based On Storm
5	A Real-time Analysis System Of Potential Demand For Weibo User In Storm
6	Design And Implementation Of STORM-based Real-time Network Analysis System
7	The Design And Implementation Of Log Analysis Based On Storm
8	Research Of Monitor System For Business Layer Based On Storm In Big Data Environment
9	The Design And Implementation Of The Storm-based Real-time Marketing Activity Monitoring And Anti-fraud Platform
10	Research And Implementation Of Big Data Real-time Processing System Based On Storm