
The Design And Implementation Of User Label System Based On Hadoop

Posted on: 2022-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: H X Zhang
GTID: 2518306740983289
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet finance, banks and Internet companies have launched online credit systems. Unlike traditional offline credit, online credit business has a much larger user base and more frequent transactions, so business data in credit systems accumulates continuously and changes in real time. This business data contains multi-dimensional characteristics of high commercial value, so analyzing and processing it in real time has become an urgent requirement for many enterprises. At present, traditional data analysis methods cannot be applied to massive-data scenarios, and offline processing technologies such as data warehouses and ETL tools cannot meet real-time processing requirements. To address the real-time processing of massive credit user data, a user label system based on Hadoop technology is designed and developed. The main contributions are as follows:

(1) User labels for the credit business scenario are designed to describe and store user characteristic information. On this basis, a real-time big data processing system is developed using Storm stream processing and HBase massive data storage, providing user label construction and verification functions. In addition, real-time synchronization of user labels is developed based on HBase Coprocessor technology: user label data is synchronized to the Elasticsearch search engine, and data analysis and visualization are implemented with Kibana.

(2) So that the user label system can serve heterogeneous credit systems and credit businesses, the common processes are extracted. To accommodate the differences among credit systems in business data collection, cleaning, and label calculation, configurable DBMS/Kafka data sources, the interface "Meta Data Converter", and the abstract class "Computable User Label" are provided as extension points, which improves the reusability and scalability of the user label system.

(3) To meet the data accuracy requirement, message retry functions for multiple scenarios are designed and implemented on top of the Storm ack mechanism, and a Redis-based distributed pessimistic lock and a version-number-based optimistic lock are introduced for the label construction and verification scenarios. Together, these contributions ensure the reliability and concurrency correctness of data processing.

Based on the key technologies above, a real-time processing system for massive user label data is designed and implemented, and the integration work for the credit system "a certain loan platform" is completed. After deployment, testing, and verification, the user label system is connected to the "a certain loan platform" system and meets the expected requirements.
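The extension points in contribution (2) and the version-number optimistic lock in contribution (3) can be illustrated with a minimal sketch. This is not the thesis code: the abstract does not give any signatures, so the method names `convert` and `compute`, the concrete classes `LoanRecordConverter` and `TotalBorrowedLabel`, and the in-memory `VersionedLabelStore` are all hypothetical stand-ins. Python is used only for brevity.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Tuple


class MetaDataConverter(ABC):
    """Hypothetical shape of the "Meta Data Converter" interface:
    cleans one raw source record into a uniform record."""

    @abstractmethod
    def convert(self, raw: Dict[str, Any]) -> Dict[str, Any]:
        ...


class ComputableUserLabel(ABC):
    """Hypothetical shape of the "Computable User Label" abstract class."""

    name: str

    @abstractmethod
    def compute(self, current: Optional[Any], record: Dict[str, Any]) -> Any:
        """Return the new label value from the stored value and one record."""


class LoanRecordConverter(MetaDataConverter):
    # Assumed field names ("uid", "amt") for one particular credit system.
    def convert(self, raw: Dict[str, Any]) -> Dict[str, Any]:
        return {"user_id": str(raw["uid"]), "amount": float(raw["amt"])}


class TotalBorrowedLabel(ComputableUserLabel):
    name = "total_borrowed"

    def compute(self, current: Optional[Any], record: Dict[str, Any]) -> Any:
        return (current or 0.0) + record["amount"]


class VersionedLabelStore:
    """In-memory stand-in for the label table; every cell carries a version.
    Single-threaded only: a real store would need an atomic compare-and-set."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, str], Tuple[Any, int]] = {}

    def read(self, user_id: str, label: str) -> Tuple[Any, int]:
        # Missing cells read as (None, version 0).
        return self._cells.get((user_id, label), (None, 0))

    def write_if_version(self, user_id: str, label: str,
                         value: Any, expected_version: int) -> bool:
        # Reject the write if someone else updated the cell in the meantime.
        _, version = self.read(user_id, label)
        if version != expected_version:
            return False
        self._cells[(user_id, label)] = (value, version + 1)
        return True


def build_label(store: VersionedLabelStore, converter: MetaDataConverter,
                label: ComputableUserLabel, raw: Dict[str, Any],
                max_retries: int = 3) -> Any:
    """Common process: convert, compute, then write under an optimistic lock."""
    record = converter.convert(raw)
    for _ in range(max_retries):
        current, version = store.read(record["user_id"], label.name)
        new_value = label.compute(current, record)
        if store.write_if_version(record["user_id"], label.name,
                                  new_value, version):
            return new_value
        # Version conflict: re-read the latest value and recompute.
    raise RuntimeError("label update failed after retries")
```

In this sketch, supporting another credit system means writing a new converter and new label subclasses, while `build_label` stays unchanged. A write that carries a stale version is rejected rather than overwriting a concurrent update, which is the behavior the version-number mechanism is meant to guarantee.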
Keywords/Search Tags: Hadoop, User Label, Credit, Storm, HBase