The Implementation Of The Storage Management Of Main Memory Database

Posted on: 2010-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: F Jin
GTID: 2178360272496383
Subject: Software engineering
Abstract/Summary:
Main memory databases are now widely used in telecommunications, finance, and other domains that demand high performance. A main memory database reduces disk I/O, uses index structures designed for memory, and applies memory-based optimization strategies. These advantages make it competitive with disk-based databases, and there is a trend for it to replace them. GBase 8a is a multi-functional, high-performance, high-availability main memory database with low resource cost, and it makes a significant contribution to the development of domestic databases. Storage management is the core module for data access: its response time directly determines the performance of a main memory database, and it matters greatly for market competitiveness.

Our database system uses a column-storage structure, which reduces CPU cycles, shortens data retrieval time, and lends itself to compression.

The storage management module consists of main-memory management, disk management, and the synchronization of data between memory and disk. For main-memory management, the struct COL stores the attributes and data of a column. The struct MEM_BLK, defined inside COL, stores column data sequentially as rowid and value, where each value comprises fix_value and var_value. With var-sized storage, fix_value holds the index of the real data, which is stored in var_value. With fix-size storage, the real data is stored in fix_value and var_value is null. For disk synchronization, the system uses the MMAP operation provided by the operating system, which simplifies some additional operations and improves performance. However, memory usage can grow so quickly that, by the time memory is full, it is too late to swap pages out.
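To make the COL and MEM_BLK layout described above concrete, here is a minimal sketch in C. Only the names COL, MEM_BLK, rowid, fix_value and var_value come from the abstract; the field types, the `count`, `name` and `is_var_sized` fields, the use of strings as var-sized data, and the two accessor functions are assumptions made for illustration, not the actual GBase 8a definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One block of column data. Only the names MEM_BLK, rowid, fix_value and
   var_value come from the text; the field types and `count` are assumed. */
typedef struct MEM_BLK {
    uint32_t     count;      /* rows stored in this block */
    uint64_t    *rowid;      /* rowid[i] identifies row i */
    int64_t     *fix_value;  /* the datum itself, or an index into var_value */
    const char **var_value;  /* NULL when the column is fix-size */
} MEM_BLK;

/* A column: its attributes plus its data block. */
typedef struct COL {
    const char *name;          /* assumed attribute */
    int         is_var_sized;  /* assumed attribute */
    MEM_BLK     blk;
} COL;

/* Fix-size column: the real data lives directly in fix_value. */
int64_t col_get_fixed(const COL *c, uint32_t i)
{
    assert(!c->is_var_sized && c->blk.var_value == NULL && i < c->blk.count);
    return c->blk.fix_value[i];
}

/* Var-sized column: fix_value holds the index of the real data in var_value. */
const char *col_get_var(const COL *c, uint32_t i)
{
    assert(c->is_var_sized && i < c->blk.count);
    return c->blk.var_value[c->blk.fix_value[i]];
}
```

In this sketch a reader of a fix-size column never touches var_value, while a reader of a var-sized column pays one extra indirection through the index held in fix_value.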
Because relying on MMAP alone reacts too late, GBase 8a adds its own disk-synchronization mechanism to keep memory usage below 80% most of the time, so that data can be loaded from disk into memory quickly.

This paper takes Decimal as an example to introduce fix-size and var-sized storage in GBase 8a. For fix-size storage, a Decimal is stored as a char, short, int, or long long, depending on its range. For var-sized storage, the paper designs a storage method that maps SQL types to C-language types, and it improves the var-sized storage mechanism step by step: from sequential storage, to hash table storage, and finally to extensible hash table storage. The final design eliminates duplicate data and reduces the number of comparison operations during a query, which significantly improves query performance.

To adapt to a variety of databases and to SQL statements involving the Decimal type, a set of functions for the Decimal type must be implemented: basic functions, type conversion functions, arithmetic functions, aggregate functions, and scalar (mathematical) functions. Basic functions must be implemented by every data type and cover the most fundamental operations, such as comparison, conversion between the type and strings, and testing whether a value is null. This paper shows the implementation of the conversion between var-sized Decimal and string. Decimal type conversion functions fall into two categories: conversion between Decimal and other types, and conversion between Decimals. This paper shows the implementation of the conversion between two Decimal types.
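As an illustration of the fix-size storage of Decimal described above, the following sketch picks the narrowest integer width for a Decimal from its precision (total number of decimal digits). The abstract only says that char, short, int, or long long is chosen according to the range; the digit thresholds below (2, 4, 9, 18) are an assumption derived from the usual 8-, 16-, 32- and 64-bit widths of those types, and the function name is hypothetical.

```c
#include <assert.h>

/* Return the storage width in bytes for the unscaled value of a Decimal
   with `precision` digits, or 0 if it needs var-sized storage.
   Thresholds are assumed: 99 fits in 8 bits, 9999 in 16 bits,
   999999999 in 32 bits, and 10^18 - 1 in 64 bits (all signed). */
int decimal_fixed_width(int precision)
{
    assert(precision > 0);
    if (precision <= 2)  return 1;  /* char      */
    if (precision <= 4)  return 2;  /* short     */
    if (precision <= 9)  return 4;  /* int       */
    if (precision <= 18) return 8;  /* long long */
    return 0;                       /* too wide: use var-sized storage */
}
```

Under these assumptions a DECIMAL(9, 2) column would occupy four bytes per value, with the scale kept in the column attributes rather than in each value.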
Arithmetic functions are implemented differently for fix-size and var-size Decimal storage. For fix-size Decimals, the arithmetic is performed directly on the underlying storage types, and the position of the decimal point is then determined. For var-size Decimals, the arithmetic operates on two numbers in radix 10^10. This paper shows the implementation of every arithmetic operation, with an example of each. Division uses the Knuth algorithm to improve its performance. Aggregate functions are relatively simple to implement: they loop over the selected data. For scalar (mathematical) functions, the Decimal is converted to float or double and the result is computed on that type. With these functions implemented, GROUP BY, ORDER BY, primary keys, and SQL query functions produce correct results.

The paper then compares the Decimal types of GBase 8a and TimesTen by running operations on Decimal data. The test results show that the two databases perform almost equally in search and aggregation, that TimesTen is slightly faster at data import, and that GBase 8a performs better when sorting and joining data with the ORDER table. Hence, the GBase 8a main memory database should have a certain degree of market competitiveness. However, fix-size Decimal storage performs almost twice as well as var-size Decimal storage, so var-size Decimal storage still needs improvement.

Finally, this paper discusses problems to be addressed in the current database version, covering main-memory management and disk synchronization. For main-memory management, memory access time can be reduced by improving the cache hit rate, and the concurrency of data access can be improved to make full use of multiple CPUs.
For disk synchronization, the current mechanism has a drawback: when the data volume exceeds memory capacity, database performance degrades steadily. The existing mechanism therefore needs to be improved so that, when the data reaches several times the memory capacity, performance is still several times that of a disk database.

In summary, the current main-memory database management system already delivers the strong performance needed for real-time requirements, but there is still room for improvement. As main memory capacity grows and CPU architectures continue to change, storage management optimization strategies will change as well, and enhancing performance will become increasingly important.
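The var-size Decimal arithmetic mentioned in the abstract operates on numbers in radix 10^10. As a rough sketch of what that means, the following C function adds two such numbers. The representation (an array of 64-bit limbs, least significant limb first, each limb below 10^10), the function name, and the restriction to non-negative magnitudes with already aligned decimal points are all assumptions; the abstract does not give the actual layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEC_RADIX 10000000000ULL  /* 10^10 */

/* Add two n-limb magnitudes a and b (least significant limb first).
   `out` must have room for n + 1 limbs. Returns the number of
   significant limbs written: n, or n + 1 if the sum carried out. */
size_t dec_add_limbs(const uint64_t *a, const uint64_t *b, size_t n,
                     uint64_t *out)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] < DEC_RADIX && b[i] < DEC_RADIX);
        uint64_t sum = a[i] + b[i] + carry;  /* below 2 * 10^10, no overflow */
        out[i] = sum % DEC_RADIX;            /* digit kept in this limb */
        carry  = sum / DEC_RADIX;            /* 0 or 1 carried to the next */
    }
    out[n] = carry;
    return carry ? n + 1 : n;
}
```

For example, adding 19999999999 (limbs {9999999999, 1}) and 1 (limbs {1, 0}) yields limbs {0, 2}, which is 20000000000. Each limb carries ten decimal digits, so the loop runs far fewer times than digit-by-digit addition would.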
Keywords/Search Tags: Main Memory Database, Column-storage, Storage Management