
Design and Implementation of Traffic Monitoring Target Big Data Analysis System Based on Spark

Posted on: 2019-06-30  Degree: Master  Type: Thesis
Country: China  Candidate: Y Li  Full Text: PDF
GTID: 2322330545962529  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the term "big data" has appeared with high frequency, and many industries and fields have made breakthrough progress with the help of big data technologies. Traffic is an important component of human activity and one of its essential conditions, so its need for big data perception is especially urgent. The vehicles shuttling through a city carry huge amounts of information every day, yet the potential value of intelligent transportation has not been effectively mined so far. Against this background, the research and development of big data platforms has emerged. Traditional data analysis systems have obvious performance bottlenecks in massive data storage and analysis, and a single computer with limited memory and CPU can no longer process data at this scale. Compared with the MapReduce distributed computing framework of Hadoop, which is widely used in big data analysis, the Spark distributed computing framework, based on RDDs (resilient distributed datasets) and an in-memory computing model, has better applicability. Based on traffic monitoring target information data, this thesis completes the design and implementation of a Spark-based big data analysis system for traffic monitoring targets. The theory and key technologies of big data are studied in depth, and the design and analysis ideas of the system are described in detail, including the system architecture and the functional modules. The functional modules mainly cover the ingestion, storage, analysis, and display of massive data. On this basis, a Spark-based traffic big data analysis system is developed, which provides system support for traffic data mining. The thesis then describes in detail the causes of the performance bottlenecks encountered during data analysis and proposes solutions based on cost optimization and on a Bloom filter. In addition, an optimized configuration scheme for the distributed message queue is proposed to maximize system throughput. Finally, experiments are designed for the optimization schemes; their effectiveness is verified on a large volume of real data, and good results are achieved after optimization.
Keywords/Search Tags: big data analysis, traffic monitoring target, Kafka, Spark SQL, cost optimization
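
The abstract mentions a Bloom filter based solution to an analysis bottleneck but does not say how the filter is applied. One common use in this kind of system, assumed here purely for illustration, is to build a Bloom filter over the join keys of a small table and use it to discard non-matching rows of a large table before the expensive join. The following is a minimal single-process sketch of that idea in plain Python; the names `BloomFilter`, `build_filter`, `prefilter`, and `prefiltered_join`, the filter sizes, and the sample plate data are all hypothetical and are not taken from the thesis.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; bit positions come from double hashing a SHA-256 digest."""

    def __init__(self, num_bits=1 << 16, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def build_filter(keys):
    """Build a Bloom filter over the join keys of the small side."""
    bloom = BloomFilter()
    for key in keys:
        bloom.add(key)
    return bloom


def prefilter(rows, bloom):
    """Keep only (key, value) rows whose key may appear on the small side."""
    return [(key, value) for key, value in rows if bloom.might_contain(key)]


def prefiltered_join(small, large):
    """Inner join of (key, value) lists, pruning the large side first."""
    bloom = build_filter(key for key, _ in small)
    lookup = {}
    for key, value in small:
        lookup.setdefault(key, []).append(value)
    joined = []
    for key, value in prefilter(large, bloom):
        # The exact lookup removes any false positives the filter let through.
        for small_value in lookup.get(key, []):
            joined.append((key, small_value, value))
    return joined


# Hypothetical data: a small watchlist of plates and a large stream of sightings.
watchlist = [("A123", "stolen"), ("B456", "unpaid")]
sightings = [("A123", "cam1"), ("C789", "cam2"), ("B456", "cam3"), ("A123", "cam4")]
result = prefiltered_join(watchlist, sightings)
```

In a single process the exact dictionary lookup makes the filter redundant; the benefit appears in a distributed setting, where a compact filter can be sent to every worker so that most non-matching rows are dropped before data is shuffled across the network. A Bloom filter never produces false negatives, so the join result is unchanged; false positives only cost some extra rows that the exact join then removes.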