Research And Implementation Of Data Governance System For Big Data Credit Reporting

Posted on:2023-12-31

Degree:Master

Type:Thesis

Country:China

Candidate:J Yue

Full Text:PDF

GTID:2568306914477484

Subject:Computer technology

Abstract/Summary:

The rapid development of Internet, cloud computing, and Internet of Things technologies has brought human society into the era of big data. In this context, big data credit reporting applies big data technology to the credit investigation industry, changing how data is collected, processed, and analyzed. Data of higher dimensionality and from different levels is used for credit score modeling, and the potential value of the data is continually mined. However, the use of massive data also brings challenges to big data credit reporting: (1) Difficult data aggregation: data usually comes from different institutions in different formats and is multi-source and heterogeneous, while existing data synchronization tools generalize poorly and need improvement in real-time incremental synchronization. (2) Difficult data lineage tracing: the introduction of big data components such as Spark and Flink couples data processing tightly to the computing engine, and conventional methods are not accurate enough, which makes data lineage harder to extract. (3) Poor data quality: data is recorded arbitrarily and arrives as logs, text, and other formats, so data integrity and standardization are not guaranteed.

To address these problems of difficult data aggregation, poor data quality, and difficult data tracing in credit reporting, and in order to expand the scope of credit data integration, assess the quality of credit data, and better realize its value, this thesis studies key data governance technologies and designs a data governance system for big data credit reporting. The main research contents are as follows: (1) A method and tool for constructing data synchronization jobs that support both offline and real-time data are proposed and implemented. Based on a study of aggregation methods and technologies for multi-source data, the thesis designs and implements a construction method and system that supports offline and real-time synchronization jobs at the same time, streamlines the configuration of synchronization jobs, and provides unified configuration across multiple synchronization methods. (2) A lineage analysis method for Flink SQL is proposed and implemented. To overcome the high coupling, high invasiveness, and poor accuracy of existing lineage analysis methods, the thesis parses Flink SQL locally and then verifies and replaces elements of the parse tree using metadata, which makes lineage analysis minimally invasive and its results accurate. (3) A data governance system for big data credit reporting is designed and implemented. Drawing on the concepts and technical schemes of data governance, the system integrates and synchronizes multi-source data, improves the quality of credit data through governance, and provides sound data support for the analysis and study of individual and enterprise credit business.

The thesis delivers a data management system that provides metadata management, data synchronization, and data quality management. Verification and testing show that the system meets expectations and has good versatility and scalability. It has been applied and validated in the "Intelligent Evaluation and Open Platform of Big data credit reporting" under the National Key R&D Program of China project "Big Data Credit Investigation and Intelligent Evaluation Technology", and it offers a reference for data governance in the big data credit industry.
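The idea behind contribution (2), parsing SQL locally and then checking the result against metadata, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: it uses a simple regular expression in place of a real Flink SQL parser, it only derives table-level (not column-level) lineage, and the catalog contents, table names, and the `table_lineage` function are all hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical metadata catalog (assumption): table name -> column names.
CATALOG = {
    "ods_user": ["user_id", "name"],
    "ods_loan": ["user_id", "amount", "overdue_days"],
    "dwd_credit": ["user_id", "name", "amount"],
}

# Hypothetical example statement (assumption), written in generic SQL.
SQL = """
INSERT INTO dwd_credit
SELECT u.user_id, u.name, l.amount
FROM ods_user AS u
JOIN ods_loan AS l ON u.user_id = l.user_id
"""


@dataclass(frozen=True)
class LineageEdge:
    source: str  # table that is read
    target: str  # table that is written


def table_lineage(sql: str, catalog: dict) -> set:
    """Return table-level lineage edges for one INSERT INTO ... SELECT statement.

    The statement is analysed locally (no job is submitted to an engine),
    and every table name found is verified against the metadata catalog.
    """
    target_match = re.search(r"\binsert\s+into\s+([\w.]+)", sql, re.IGNORECASE)
    if target_match is None:
        return set()  # not a write statement, so it produces no lineage
    target = target_match.group(1)

    # Every identifier that directly follows FROM or JOIN is treated as a source.
    sources = re.findall(r"\b(?:from|join)\s+([\w.]+)", sql, re.IGNORECASE)

    # Metadata verification step: reject names the catalog does not know.
    unknown = [t for t in [target, *sources] if t not in catalog]
    if unknown:
        raise KeyError(f"tables missing from metadata catalog: {unknown}")

    return {LineageEdge(source=s, target=target) for s in sources}


edges = table_lineage(SQL, CATALOG)
```

Because the analysis runs on the SQL text alone, it does not need to hook into the computing engine, which is the low-invasiveness property the abstract describes. A production version would need a real SQL parser to handle subqueries, CTEs, and column-level lineage, none of which this regex-based sketch covers.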

Keywords/Search Tags:

data governance, data synchronization, data lineage, metadata management, big data credit reporting

Related items

1	Design And Implementation Of A Metadata Service Management Platform For Multi-source Heterogeneous Big Data
2	Research And Implementation Of Multi-source And Heterogeneous Data Governance Platform
3	Design And Implementation Of Big Data Integrated Storage And Governance System For Multi Scenarios
4	Design And Implementation Of Data Asset Management Platform Of The Manufacturing
5	Design And Implementation Of Enterprise Data Governance Platform
6	Inventory data warehouse system with lineage tracing for small retail chain
7	Data Governance And Application Based On Campus Data Center
8	Design And Implementation Of Data Warehouse Management Module For Mobile Reading Platform
9	Design And Implementation Of Data Governance System Based On Hadoop
10	Constructional Method For Standard Data Management Platform Of Financial Assets Industry