The Research and Implementation of Real-Time Data Ingestion on the CLAIMS System

Posted on: 2018-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: K Yu
GTID: 2348330512481306
Subject: Software engineering

Abstract/Summary:
With the growth of Internet applications, companies, including those in the financial sector, face the challenge of storing and managing massive data sets, and have begun developing distributed database management systems to improve data-processing capacity and system scalability. Because the value of data increases as the latency of ingesting and analyzing newly generated data decreases, industry requires database systems to support distributed real-time data ingestion and real-time data analysis, and in particular transactional ingestion and analysis that guarantee the correctness and consistency of the data. However, traditional transactional data processing in distributed environments is almost always implemented with locks and the two-phase commit protocol, which makes real-time data ingestion difficult to achieve. On the other hand, NoSQL systems do not provide transactional data ingestion.

CLAIMS is an open-source distributed in-memory OLAP engine that offers high query performance through a SQL interface. In its current state, however, it lacks transactional data ingestion and therefore does not meet the requirements of the financial industry. In this thesis, we propose and implement a real-time data ingestion framework with a transactional concurrency control mechanism that leverages the separation of metadata and physical data to achieve non-blocking data ingestion in real-time processing. An extensive set of experimental studies confirms that the proposed framework satisfies the requirements of real applications.

The major contributions of this thesis are as follows:

1. We propose a metadata-oriented transactional concurrency control mechanism that separates the control flow of metadata from that of physical data to achieve non-blocking data ingestion. It achieves a serializable isolation level among data ingestion transactions by updating the metadata atomically, separates writing transactions from reading transactions by having readers read a snapshot, and achieves strong consistency in transactional processing.
2. Based on the proposed transaction manager, we design and implement a transactional, distributed, real-time data ingestion engine in CLAIMS. The ingestion framework uses transactional processing to guarantee the correctness of data and lock-free structures to make ingestion non-blocking and real-time, which leads to high throughput and low latency.
3. We conduct a series of extensive experimental studies, including a comparison with VoltDB, which confirm that the transactional distributed real-time data ingestion framework satisfies transactional properties and provides a high-throughput, low-latency data ingestion service.

The proposed metadata-oriented transactional concurrency control mechanism is novel and of academic value, and the experimental studies confirm its value in real applications. This work can serve as a reference for other database systems.
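The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: ingestion writes physical data without coordinating with readers, a transaction becomes visible through one atomic metadata update, and readers work from a metadata snapshot. The names `SnapshotTable`, `begin_ingest`, `commit`, `snapshot`, and `scan` are hypothetical and are not taken from CLAIMS. The short lock around the metadata update is a stand-in for the lock-free structures mentioned in the abstract.

```python
import threading


class SnapshotTable:
    """Toy table that separates physical data from visibility metadata.

    Hypothetical illustration only; this is not the CLAIMS implementation.
    """

    def __init__(self):
        # Metadata: an immutable tuple of committed chunks. Readers only ever
        # see chunks that are referenced from this tuple.
        self._chunks = ()
        # Assumption: a brief lock stands in for an atomic metadata update.
        # It is held only while the new tuple is built and published.
        self._commit_lock = threading.Lock()

    def begin_ingest(self):
        """Start an ingestion transaction with a private, unshared buffer."""
        return []

    def commit(self, buffer):
        """Publish a transaction's rows with a single metadata update."""
        chunk = tuple(buffer)  # freeze the physical data before publishing
        with self._commit_lock:
            # Commits are applied one after another, so the committed
            # history is a serial order of ingestion transactions.
            self._chunks = self._chunks + (chunk,)

    def snapshot(self):
        """Return the current metadata; later commits do not change it."""
        return self._chunks


def scan(snapshot):
    """Read every row visible in the given metadata snapshot."""
    return [row for chunk in snapshot for row in chunk]
```

In this sketch, rows appended to a buffer stay invisible until `commit` is called, and a reader that took a snapshot before a commit keeps seeing the earlier state, so reads never block ingestion and never observe a partially ingested transaction. Dropping a buffer without committing it plays the role of an abort.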
Keywords/Search Tags:real-time data ingestion, distributed transaction, database, CLAIMS