Design And Implementation Of Transaction Data Analysis Platform Based On DRC

Posted on: 2021-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Yang
GTID: 2428330647963659
Subject: Computer technology
Abstract/Summary:
The growth of Internet technology and the spread of online trading platforms and applications have generated large volumes of transaction data, and data analysis has become increasingly important within enterprises. However, because the data is massive, heterogeneous, multi-source, and dynamic, and because traditional software architectures are limited, most transaction data is locked in data islands across different business systems. At the same time, each business system performs its own data analysis, which is scattered, narrow in scope, and hard to reuse. Fragmented applications such as internal e-commerce business systems, App systems, and PC applications make transaction data difficult to manage, and enterprises lack a unified data management center and data analysis platform. Breaking down data silos and information islands, managing and analyzing transaction data in a unified way, and extracting the information hidden in the data to realize market value have therefore become urgent problems.

DOA (Data Oriented Architecture) offers a solution to these problems. DRC (Data Register Center), the core component of DOA, registers data information in a unified way to form a logical resource pool, which enables centralized management and sharing of fragmented transaction data. This thesis builds a comprehensive DRC-based platform for transaction data management and analysis. The platform supports both real-time processing and offline computation, covers the various business scenarios and requirements of transaction data analysis, and is designed to be general-purpose and stable.

The research content of this thesis is as follows:
(1) Based on the DOA approach to data management, analyze the basic attributes and characteristics of registered data, and study a general metadata registration specification for structured data.
(2) Based on how the data registration center manages registered data in a unified way and provides external data access services, study the registration method for structured data and the implementation plan of the DRC.
(3) Based on the Spark Streaming real-time stream computing framework, implement real-time statistical analysis of transaction data. The platform obtains source data indirectly through the DRC, uses Canal and Kafka to collect and transmit the source data in real time, and uses the WebSocket full-duplex communication protocol to deliver low-latency, fast-responding data visualization.
(4) Based on the Hadoop batch computing framework, build an offline computing module that provides calculation and analysis services for large batches of static transaction data accumulated over long periods. An implementation plan for a personalized recommendation service built on this offline computing module is also studied.
(5) To optimize the performance of the platform's Spark Streaming real-time statistical analysis, design a progressive dynamic update strategy for the batch interval. The strategy replaces the traditional static batch interval configuration with dynamic adjustment, so that the batch interval changes with the load, which improves the platform's performance and keeps it stable.

The research results and innovations of this thesis are as follows:
(1) A DRC structured data registration specification is proposed. The attributes and characteristics of structured data are analyzed, a general metadata-based specification for structured data registration is designed, and a corresponding registration method and data access service implementation plan are proposed.
(2) A DRC-based transaction data analysis platform that supports real-time statistical analysis and offline computation is designed. The platform integrates transaction data management with analysis, and is general-purpose and stable.
(3) A progressive dynamic update strategy for the batch interval is proposed. The statically configured batch interval of a traditional Spark Streaming program is replaced with dynamic adjustment, which adapts well to load changes and improves the performance and stability of the platform.
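
The abstract does not spell out how the progressive batch interval update works, so the following is only a minimal sketch of one way such a strategy could look. It assumes the interval is nudged by a fixed step whenever the ratio of batch processing time to batch interval leaves a target band. The class name `BatchIntervalController`, the thresholds, and the step size are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class BatchIntervalController:
    """Hypothetical step-wise controller; every default below is an assumption."""

    interval_ms: int = 2000    # current batch interval
    min_ms: int = 500          # lower bound for the interval
    max_ms: int = 10000        # upper bound for the interval
    step_ms: int = 250         # size of one progressive adjustment
    high_ratio: float = 0.9    # above this load, the interval is too short
    low_ratio: float = 0.5     # below this load, the interval is too long

    def update(self, processing_ms: float) -> int:
        """Adjust the interval from the last batch's processing time."""
        load = processing_ms / self.interval_ms
        if load > self.high_ratio:
            # Batches barely finish (or overrun): lengthen the interval one step.
            self.interval_ms = min(self.max_ms, self.interval_ms + self.step_ms)
        elif load < self.low_ratio:
            # Plenty of headroom: shorten the interval one step to cut latency.
            self.interval_ms = max(self.min_ms, self.interval_ms - self.step_ms)
        return self.interval_ms


if __name__ == "__main__":
    controller = BatchIntervalController()
    # Simulated per-batch processing times in milliseconds.
    for processing_ms in [1900, 2200, 2300, 800, 700]:
        new_interval = controller.update(processing_ms)
        print(f"processing={processing_ms} ms -> interval={new_interval} ms")
```

Moving in small steps rather than jumping straight to a computed target keeps the interval from oscillating under bursty load, which is one plausible reading of "progressive". Note that stock Spark Streaming fixes the batch interval when the `StreamingContext` is created, so how a newly computed interval is actually applied on the platform is outside what this sketch covers.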
Keywords/Search Tags:Data analysis, Data management, DRC, Spark Streaming, Hadoop