The Design And Implementation Of Real-time Processing System For Device Log Stream Data Based On Storm

Posted on: 2020-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Xue
GTID: 2518305732497924
Subject: Master of Engineering
Abstract/Summary:
With the rise of IoT technology, machines and sensor devices in industrial enterprises continuously generate large volumes of device log data. This data arrives in real time and is unbounded. Industrial companies need to mine the potential value of these device log streams to improve business efficiency. Currently, the device log processing systems used by industrial enterprises are based on Hadoop. These systems read static log data directly from a database and process it offline. Although their throughput is high, their response time is not guaranteed, so they are suitable only for batch processing of massive static log data and cannot meet the real-time requirements of processing device log stream data.

Storm is a distributed real-time computation system that makes it possible to process large-scale device log stream data in real time. However, Storm uses a single control node to handle task allocation, code distribution, and monitoring for the entire cluster. When the control node goes down, topology tasks submitted to the cluster cannot be executed, so the system has a single point of failure.

In this thesis, a real-time processing system for device log stream data based on Storm is designed and implemented. It addresses both the real-time problem and the single control node failure problem in current device log processing systems. First, to address the real-time problem, the system uses Storm as the framework for log stream processing, which eliminates batch data collection time and job scheduling time and thereby shortens the system's response time. In addition, the system applies a sliding time window when processing the log stream and processes only the data that falls inside the window, so that results reflect recent data. Second, to address the single point of failure, the system adopts an optimized cluster with one master control node and multiple slave control nodes, and keeps the slave control nodes synchronized with the current master control node. The cluster relies on the coordination mechanism of the Zookeeper cluster: after the master control node goes down, a new master control node is elected from the remaining slave control nodes so that topology tasks continue to run, which resolves the single point of failure in the cluster.

The system displays the real-time processing results visually to workshop staff in industrial enterprises, so that they can track and monitor the running status of workshop equipment in real time. With this system, workshop staff can manage workshop machines easily and intelligently. Ultimately, the system can help companies improve efficiency and reduce costs.
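
The sliding time window mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation and does not use Storm's API; the class and field names (`LogEvent`, `SlidingTimeWindow`, `device_id`, `value`) are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class LogEvent:
    """One device log record (all field names are hypothetical)."""
    device_id: str
    timestamp: float  # event time, in seconds
    value: float      # e.g. a sensor reading


class SlidingTimeWindow:
    """Keeps only the events from the last `window_seconds` seconds.

    Assumes events are added in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[LogEvent] = deque()

    def add(self, event: LogEvent) -> None:
        """Append a new event, then drop events that left the window."""
        self._events.append(event)
        cutoff = event.timestamp - self.window_seconds
        # Events at or before the cutoff are too old to be processed.
        while self._events and self._events[0].timestamp <= cutoff:
            self._events.popleft()

    def __len__(self) -> int:
        return len(self._events)

    def average(self) -> Optional[float]:
        """Mean of the values currently in the window, or None if empty."""
        if not self._events:
            return None
        return sum(e.value for e in self._events) / len(self._events)
```

Each call to `add` advances the window to the newest event's timestamp, so any statistic computed afterwards, such as `average()`, covers only recent log data rather than the full history.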
Keywords/Search Tags:Streaming Calculation, Big Data, Storm, Kafka, Zookeeper, HBase