
Design And Implementation Of Big Data Analysis Platform For Commercial Banks

Posted on: 2019-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Bu
Full Text: PDF
GTID: 2428330590995856
Subject: Electronic and communication engineering
Abstract/Summary:
A commercial bank is in effect a big data company, and one of its future transformations is to become data-driven rather than capital-driven. Bank data is highly confidential and sensitive; used properly, it can become a gold mine for banks. By managing and using data as an asset across the enterprise, and by promoting innovation and transformation, a bank can strengthen its competitive advantage. How should commercial banks collect data, use data, analyze data on their own, build unique competitive advantages, and open the door to the future evolution of smart finance? Answering this question is the significance and purpose of this research.

First, in view of this background and these problems, the thesis studies the principles of big data technology. A big data analysis platform based on Hadoop and Spark is designed and built according to production requirements. The platform is general-purpose and stable, and supports both offline batch processing and real-time stream processing. The system design and implementation process is described in detail in the thesis.

Then, the offline computing module is designed and implemented. This part first studies the optimal scheme for collecting data from relational databases. For data cleaning, it focuses on how to implement data deduplication in Hive. The data warehouse tables are queried with HiveQL, and the results are combined with MySQL and OLAP to provide the data analysis service. The thesis also proposes ideas for optimizing the OLAP data model.

Next, a real-time computing module is designed and implemented. The main research question is how to convert, collect, and compute on the real-time log stream from servers. The work centers on the design of a Flume + Kafka + Spark Streaming architecture and a method for reading Kafka data without loss. In addition, the real-time processing performed by generateJobs in Spark Streaming is studied in detail.

Finally, a data visualization module based on ECharts is researched and designed, in combination with JavaScript, Ajax, PHP, and other technologies. Testing shows that the function and performance of the system meet the design requirements. The system is currently in production use and has achieved good results.
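
The abstract does not give the thesis's actual deduplication logic, so the following is only a minimal sketch of one common approach: keep a single record per business key, choosing the most recently updated one. In Hive this is often written with `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)`; the Python below mirrors that logic in memory. The field names (`account_id`, `updated_at`, `balance`) and the function name `deduplicate_latest` are hypothetical and chosen purely for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def deduplicate_latest(
    records: Iterable[Record],
    key_fields: Tuple[str, ...],
    order_field: str,
) -> List[Record]:
    """Keep one record per key: the one with the largest order_field value.

    Ties keep the record seen first. Output follows the order in which
    each key first appeared in the input.
    """
    latest: Dict[tuple, Record] = {}
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        kept = latest.get(key)
        if kept is None or rec[order_field] > kept[order_field]:
            latest[key] = rec
    return list(latest.values())


# Hypothetical sample rows; real bank tables would have different columns.
rows = [
    {"account_id": "A1", "updated_at": "2019-01-01", "balance": 100},
    {"account_id": "A1", "updated_at": "2019-03-01", "balance": 250},
    {"account_id": "B2", "updated_at": "2019-02-15", "balance": 80},
    {"account_id": "A1", "updated_at": "2019-02-01", "balance": 180},
]

deduped = deduplicate_latest(rows, key_fields=("account_id",), order_field="updated_at")
for row in deduped:
    print(row)
```

Choosing the latest record per key is an assumption; a pipeline that must drop only exact duplicate rows would instead use every column as the key.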
Keywords/Search Tags: Commercial Bank, Big Data, Hadoop, Spark, Visualization