
Design And Implementation Of Real-time Computing System For Yunnan Highway Network Based On Spark Streaming

Posted on: 2021-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Wang
GTID: 2518306245481964
Subject: Computer technology
Abstract/Summary:
The digital transformation of traditional industries has been the prevailing trend in enterprise development in recent years. Enterprise data resources have grown rapidly as information systems have been applied across production activities, and ERP, CRM, HR, OA, and other office systems have gradually become standard components of enterprise informatization. The resulting massive data is stored in databases only as a trade secret and does not play its due role as a data resource. In recent years, the rapid development of middleware technology has gradually produced a process of automatic data collection and processing, together with the management of data resources in data warehouses. This turns the data into assets that serve as a reference for operational decisions and highlights the business value of data resources. In 2019, this kind of systematic enterprise digital transformation project came to be called a "data center", and the "big data computing platform" described in this thesis is an important part of the "data center".

The multiple information systems of the highways in Yunnan Province generate data that is large in volume and diverse in type. There is an urgent need for a real-time big data computing platform that collects production system data and provides full-process data calculation services to meet the requirements of traffic data analysis, revenue operation analysis, and auditing of pass records. The highway networking system of Yunnan Province currently uses Hive offline computation for data analysis and for the external audit calculation of traffic records. This approach waits for data to be collected and stored at regular intervals and then processes it in batches with Hive; tasks are queued before the results are finally computed. Its disadvantage is that the data transfer cycle is too long to support application scenarios that need timely calculation. In the real-world scenario of PB-level data association calculations, the Hive offline computing platform of the Yunnan Highway Networking System needs a turnaround cycle of at least 12 hours to produce the result data of a batch task.

To solve the long data acquisition and processing cycle and the high latency of the existing Hive offline computing platform of the Yunnan Highway Networking Project, this thesis studies big data computing platforms and their internal computing technology in depth. The main research contents are as follows:

1. Study the data characteristics of the data production subsystems of the Yunnan provincial highway network, and explore the feasibility and operability of real-time computing scenarios such as traffic data analysis, revenue operation analysis, and data auditing in the Yunnan highway network system.
2. By studying the stream data processing and calculation systems used in various industries, design and implement a real-time calculation system for the highway network system in Yunnan Province that is tailored to traffic-domain scenarios.
3. Based on the results of the previous two studies, analyze, in the Spark environment, the technical requirements of the real-time computing application scenarios of highway tolls in Yunnan Province, and systematically design and implement a real-time computing platform comprising four modules: stream data management, stream data processing and calculation, data query, and data factory.

The platform modularizes the specific data processing operations commonly used in highway network toll collection and fixes the stream data processing flow, abstracting it into stream operations that can be run systematically. The front-end page of the platform is built from drag-and-drop components so that business operators can use it easily.

The real-time computing platform based on Spark Streaming that is researched and implemented in this thesis has completed production development in the Yunnan Highway Networking Project. It was officially launched on January 1, 2020, after the nationwide switchover that removed provincial-boundary toll stations, and supports daily business analysis, road operation monitoring, and the audit and rectification of toll theft and fee evasion on expressways.
Keywords/Search Tags:Real-time computing, stream processing, highway networking
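
The contrast the abstract draws between queued offline batch jobs and stream processing can be illustrated with a small sketch. It is written in plain Python with no Spark dependency and is not the thesis's implementation. It only mimics the micro-batch model that Spark Streaming is built on: records are cut into short fixed intervals and each interval is aggregated on its own. The record fields (`station_id`, `timestamp`, `fee`), the function names, and the 5-second interval are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class PassRecord:
    """Hypothetical toll pass record; the field names are illustrative only."""
    station_id: str
    timestamp: int  # seconds since an arbitrary start point
    fee: float      # toll charged for this pass


def micro_batches(records: Iterable[PassRecord], interval: int) -> Dict[int, List[PassRecord]]:
    """Group records into fixed-length batches keyed by the batch start time."""
    batches: Dict[int, List[PassRecord]] = defaultdict(list)
    for record in records:
        batch_start = (record.timestamp // interval) * interval
        batches[batch_start].append(record)
    return dict(sorted(batches.items()))


def revenue_per_station(batch: Iterable[PassRecord]) -> Dict[str, float]:
    """Sum the fees of one batch per toll station."""
    totals: Dict[str, float] = defaultdict(float)
    for record in batch:
        totals[record.station_id] += record.fee
    return dict(totals)


def process_stream(records: Iterable[PassRecord], interval: int) -> Dict[int, Dict[str, float]]:
    """Aggregate each micro-batch independently, one result per interval."""
    return {
        start: revenue_per_station(batch)
        for start, batch in micro_batches(records, interval).items()
    }


if __name__ == "__main__":
    sample = [
        PassRecord("A", 0, 10.0),
        PassRecord("B", 3, 5.0),
        PassRecord("A", 4, 2.5),
        PassRecord("A", 7, 1.0),
    ]
    for start, totals in process_stream(sample, interval=5).items():
        print(start, totals)
```

In a real streaming job the batches would arrive continuously and each result would be available shortly after its interval closes, instead of after a long offline cycle; here the whole input is passed in at once only to keep the sketch self-contained.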