Design And Implementation Of High Reliability Columnar Storage Engine For Distributed Memory Database

Posted on:2022-04-18

Degree:Master

Type:Thesis

Country:China

Candidate:S Ni

Full Text:PDF

GTID:2518306524493424

Subject:Master of Engineering

Abstract/Summary:

With the large-scale adoption of mobile Internet and Internet of Things devices, the world has entered a post-information society, and the 21st century has become the century of big data. The ever-increasing demand for massive data storage, processing, and analysis means that traditional databases can no longer meet requirements, and distributed databases have emerged in response. After years of development, distributed databases now follow three major research directions. The first is NewSQL, which uses consensus algorithms such as Paxos or Raft to provide highly available data and strongly consistent distributed transactions, meeting users’ consistency requirements in distributed scenarios. The second is sharding, which builds on MySQL’s years of technical accumulation to provide users with stable database services. The third is the cloud-native database: through an architecture that separates storage from computing, combined with cloud virtualization technology, storage and computing resources are treated as resource pools, which lets a distributed database scale out and in quickly, reduces usage costs, and yields higher performance.

A distributed in-memory database with columnar storage stores data by column. Since the values in each column share a single, known data type, the system can compress them to a high degree. When accessing data, the system can read only the columns involved, which reduces I/O, and it can use the processor's parallel processing capability to improve efficiency during computation.

This thesis builds on GoldFish, an in-memory database developed in-house by the teaching and research office, and targets OLAP (online analytical processing) scenarios. It designs and implements a columnar storage engine module based on the Raft protocol, which strengthens the reliability of the system, and it implements SDO (Slice Data Outline), which improves query performance. The main contents of this thesis are as follows:

1. A multi-replica mechanism strengthens data reliability, so that the system can recover data from other nodes when a physical node crashes or persistent memory is damaged. This solves the single-point-of-failure problem and makes the system's data highly reliable.
2. An index over column fragment data is implemented to improve query performance, so that the system returns results faster when running multi-condition queries, aggregate queries, and queries with joins.
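The abstract does not describe how the column fragment index works internally. As a rough illustration only, the sketch below assumes a simple design in which each column is cut into fixed-size slices and each slice keeps a minimum/maximum summary, so a filter can skip slices that cannot contain a match. The names `SliceOutline`, `build_outlines`, `scan_greater_than`, and `SLICE_SIZE` are hypothetical and are not taken from the thesis or from GoldFish.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

SLICE_SIZE = 4  # hypothetical slice length, kept small for illustration


@dataclass
class SliceOutline:
    """Assumed per-slice summary: row range plus min/max of the values."""
    start: int    # index of the first row in the slice
    end: int      # one past the last row in the slice
    min_val: int
    max_val: int


def build_outlines(column: List[int], slice_size: int = SLICE_SIZE) -> List[SliceOutline]:
    """Cut one column into slices and record a min/max summary for each."""
    outlines = []
    for start in range(0, len(column), slice_size):
        chunk = column[start:start + slice_size]
        outlines.append(SliceOutline(start, start + len(chunk), min(chunk), max(chunk)))
    return outlines


def scan_greater_than(
    column: List[int], outlines: List[SliceOutline], threshold: int
) -> Tuple[List[int], int]:
    """Return the matching row ids and the number of slices actually scanned."""
    rows: List[int] = []
    scanned = 0
    for outline in outlines:
        if outline.max_val <= threshold:
            continue  # no value in this slice can exceed the threshold
        scanned += 1
        for i in range(outline.start, outline.end):
            if column[i] > threshold:
                rows.append(i)
    return rows, scanned


# A table stored column by column: a query on "amount" never reads "order_id".
table: Dict[str, List[int]] = {
    "order_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "amount": [5, 7, 3, 9, 40, 42, 38, 41, 8, 6],
}

amount_outlines = build_outlines(table["amount"])
rows, scanned = scan_greater_than(table["amount"], amount_outlines, 30)
print(rows, scanned)
```

In this toy data only the middle slice has a maximum above 30, so the filter reads one slice out of three. A real engine would combine such summaries with compression and with replication of the slices across nodes, which this sketch leaves out.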

Keywords/Search Tags:

Distributed Database, Slice Data Outline, Consistency Protocol, Columnar Storage, OLAP

Related items

1	OLAP Queries Based On P2P Distributed Storage Technology Research And Implementation
2	Design And Implementation Of Optimization Method For Distributed Columnar In-Memory Database Storage Engine
3	Design And Implementation Of NVM And SSD Oriented Columnar Database Storage Engine
4	Design And Implementation Of Columnar Storage System For User Behavior Logs
5	Design And Optimization Of Data Consistency Protocol For Distributed Storage System
6	An OLAP-Oriented Distributed Key-Value Storage Engine
7	Design And Implementation Of Query Optimizer For Massive Distributed Columnar Database
8	Research And Application Of Layered Distributed Consistency Protocol
9	The Research On WAN Distributed Storage Technologies Based On P2P Architecture
10	Design And Implementation Of Transaction System Based On Distributed Columnar In-memory Database