
Design And Implementation Of Data Collection And Distribution In Telecom Operation Analysis Platform Based On Big Data

Posted on: 2019-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: W Chen
GTID: 2428330590475163
Subject: Software engineering
Abstract/Summary:
In recent years, researchers and academic institutions have produced a great deal of excellent research on big data processing, and traditional Hadoop-based systems have become very mature. However, with rapid changes in business forms, continuous innovation in business application models, and increasingly personalized user experiences, businesses place ever higher demands on the real-time processing of big data. Real-time capability has become the most urgent requirement of the big data processing ecosystem. This thesis presents the design and implementation of data acquisition and distribution in a telecom operation analysis platform based on big data. The main work is as follows:

Because of the large data volume and the high accuracy and real-time requirements placed on the data, the traditional Hadoop architecture could no longer satisfy the existing data acquisition and distribution requirements. In this thesis, in addition to offline batch processing with Hadoop, Storm, an open-source, distributed, and highly fault-tolerant real-time stream computing framework, was adopted to restructure the system, and Kafka was selected as a high-throughput distributed message queue to manage the real-time information that needs to be shared. A big data acquisition and distribution system that supports both real-time and batch processing was implemented.

The Oozie workflow engine was used to manage data collection and distribution tasks in a unified way, handling both data flow tasks and workflow tasks. Because all Oozie task parameters are configurable, different processing modules are called according to the task type, which realizes unified dispatch of data flow tasks and workflow tasks.

Because the system's data acquisition and distribution involve multiple sources and destinations, and the business logic rules of the various tasks are complex, the data acquisition and distribution functions were componentized, and data and computation were separated. According to the configured task file, the system can flexibly call the components and execute them in the order given by the configured component flow, realizing data acquisition and synchronization from any source to any destination.

The test results showed that the new architecture, based on Storm stream processing combined with the Kafka distributed message queue and the Oozie unified task scheduling engine, greatly improved processing efficiency, real-time performance, and scalability compared with the traditional data acquisition and distribution scheme.
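The configuration-driven component flow and the dispatch by task type described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the registry, the component names (`drop_empty`, `mask_id`), the task dictionary layout, and the `user_id` field are all hypothetical, and plain in-memory functions stand in for the Storm, Kafka, and Oozie parts.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Component = Callable[[Iterable[Record]], Iterable[Record]]

# Hypothetical registry mapping component names (as they would appear
# in a task file) to the functions that implement them.
COMPONENTS: Dict[str, Component] = {}


def component(name: str) -> Callable[[Component], Component]:
    """Register a processing component under a configurable name."""
    def register(fn: Component) -> Component:
        COMPONENTS[name] = fn
        return fn
    return register


@component("drop_empty")
def drop_empty(records: Iterable[Record]) -> Iterable[Record]:
    # Discard records whose (hypothetical) user_id field is empty.
    return (r for r in records if r.get("user_id"))


@component("mask_id")
def mask_id(records: Iterable[Record]) -> Iterable[Record]:
    # Keep the first 3 and last 4 characters, mask the middle.
    for r in records:
        uid = r["user_id"]
        yield {**r, "user_id": uid[:3] + "****" + uid[-4:]}


def run_components(task: dict, source: Iterable[Record]) -> List[Record]:
    """Run the components in the order listed in the task configuration."""
    records = source
    for name in task["components"]:
        records = COMPONENTS[name](records)
    return list(records)


# Stand-ins for the two processing paths; in the architecture the
# abstract describes, these would be batch and streaming modules.
HANDLERS: Dict[str, Callable[[dict, Iterable[Record]], List[Record]]] = {
    "workflow": run_components,
    "dataflow": run_components,
}


def dispatch(task: dict, source: Iterable[Record]) -> List[Record]:
    """Pick the processing module from the configured task type."""
    try:
        handler = HANDLERS[task["type"]]
    except KeyError:
        raise ValueError(f"unknown task type: {task.get('type')!r}")
    return handler(task, source)


if __name__ == "__main__":
    task = {"type": "workflow", "components": ["drop_empty", "mask_id"]}
    rows = [{"user_id": "13800001234"}, {"user_id": ""}]
    print(dispatch(task, rows))
```

Keeping the component list in the task configuration rather than in code is what lets the same runner serve different source-to-destination flows: a new flow only needs a new task definition, provided the components it names are registered.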
Keywords/Search Tags:Big Data, Business Intelligence, Storm, Oozie