
Research And Implementation Of Real-time Banking Statistics Report Based On Hadoop

Posted on: 2019-04-15
Degree: Master
Type: Thesis
Country: China
Candidate: W L Duan
Full Text: PDF
GTID: 2428330548973475
Subject: Computer technology
Abstract/Summary:
In general, a bank presents its production and operating data in the form of reports, and intelligent statistical analysis of the report data helps the bank's management make decisions. A report system is a tool that uses computers to perform statistical analysis and produce reports, and it plays a very important role in a bank's production and management. Through a study of one bank's reporting system, the author finds two problems. First, the system cannot complete calculations over more than 10 million records. Second, it cannot query report data in real time.

The bank's existing reporting system uses stored procedures to calculate the data. As transaction data grows, this method becomes too inefficient to meet the current demand for data calculation. In addition, when the bank queries report data, the existing system calculates the data offline and then queries the database directly to obtain the results. This approach has two drawbacks. First, when the total amount of queried data exceeds 10 million records, it not only fails to meet the requirement for fast queries but also puts great pressure on the database. Second, the data cannot be counted and queried in real time.

To address these two problems, this thesis proposes the following solutions. First, for the problem that tens of millions of records or more cannot be calculated well, the thesis adopts the idea of "divide and conquer" and proposes using HDFS for data storage and MapReduce for parallel computation. The data is calculated and saved as intermediate files, and this solution successfully completes calculations over large data volumes. Second, for the problem that report data cannot be counted and queried in real time, the thesis introduces SolrCloud, a distributed indexing technology, and uses its efficient, highly concurrent query capability to greatly shorten report query and statistics time. Experiments show that this method achieves real-time report statistics and queries. Experimental testing and analysis show that the real-time statistical reporting system designed for the above problems successfully completes calculations over tens of millions of records and can query the statistical results of the transaction data in real time, achieving the goals of the original system design.
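The split, compute, and merge pattern behind the MapReduce step can be illustrated with a minimal single-process sketch in plain Python. This is not the thesis's implementation and does not use Hadoop. The record layout `(branch, amount)`, the function names, and the chunk size are all assumptions chosen for illustration. Each chunk stands in for one input split, the per-chunk partial result stands in for an intermediate file, and the merge stands in for the reduce phase.

```python
from collections import defaultdict
from itertools import islice


def split_records(records, chunk_size):
    """Divide the input into fixed-size chunks (stand-in for input splits)."""
    it = iter(records)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def map_chunk(chunk):
    """Map step: compute a partial [count, total] per branch for one chunk."""
    partial = defaultdict(lambda: [0, 0])
    for branch, amount in chunk:  # hypothetical record layout
        partial[branch][0] += 1
        partial[branch][1] += amount
    return partial


def reduce_partials(partials):
    """Reduce step: merge the partial results into one report."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for branch, (count, total) in partial.items():
            merged[branch][0] += count
            merged[branch][1] += total
    return {branch: (count, total) for branch, (count, total) in merged.items()}


def build_report(records, chunk_size=2):
    """Run split -> map -> reduce over the records."""
    chunks = split_records(records, chunk_size)
    return reduce_partials(map_chunk(chunk) for chunk in chunks)


if __name__ == "__main__":
    # Amounts are integers (e.g. cents) to avoid floating-point rounding.
    sample = [("A", 100), ("B", 250), ("A", 50), ("C", 75), ("B", 25)]
    print(build_report(sample))
```

Because each chunk is processed independently, the map step could run in parallel on separate nodes, which is the property that lets this pattern scale where a single stored procedure does not.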
Keywords/Search Tags:Hadoop, MapReduce, SolrCloud, Data encryption