Code Generation Technology in Column-Store In-Memory Database

Posted on: 2020-04-18, Degree: Master, Type: Thesis
Country: China, Candidate: X H Teng
GTID: 2428330596975075, Subject: Computer Science and Technology
Abstract/Summary:
In an era of rapidly developing information technology, there is strong demand for storing and querying information. Now that nearly every kind of real-world information can be converted into electronic signals, databases play an important role in data processing, and speeding up queries has become an important topic in database design and implementation. A database's complex type system and interpreted execution model introduce many redundant computations, which lead to low efficiency and poor use of the characteristics of modern hardware. In addition, a traditional compiler can only generate binary code for a program before it runs, so it cannot remove the uncertainty a database faces until run time. If the dynamic information available in the database is used to eliminate unneeded instructions, execution efficiency can be improved.

This thesis discusses generating efficient binary code with a JIT compiler that exploits both the operators' computation logic and the dynamic information in the database. It does not pursue the approach of compiling a whole query tree into a single function, because in a distributed system the column-store engine would cancel out that improvement. Instead, dynamic code generation is applied only within individual operators. With the JIT compilation library packaged by the Impala data engine as the underlying support component, the thesis describes the implementation of compiled execution for the order operator, the group operator, and the expression calculator. These operators come from Goldfish, a column-store in-memory database. The thesis also proposes adaptive code generation based on row count.

The thesis describes in detail how code generation achieves high run-time execution efficiency for the order, group, and aggregate operators, and unit tests are used to verify the new operators. The results show that the improvement grows with the number of tuples: the order and group operators become several times faster, and the aggregate operator also benefits from code generation to some extent. A performance analysis tool shows that fewer instructions are generated for the new operators than for the old operators without code generation, and that their peak memory footprint is also lower. Code generation therefore improves not only running time but also the space used by the database. However, code generation itself takes time, which can reduce the program's performance. In practice, a row-count threshold should therefore be set to decide whether code generation is used for a given execution.
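The row-count threshold described above can be illustrated with a minimal sketch. It is not the thesis's implementation: Goldfish and the Impala JIT library are not used here, Python's built-in `compile`/`exec` stands in for the JIT compiler, rows are plain tuples rather than columns, and the names `CODEGEN_ROW_THRESHOLD`, `interpreted_sort_key`, `generated_sort_key`, and `order_by` as well as the threshold value are all hypothetical.

```python
# Hypothetical threshold; the abstract does not give a concrete value.
CODEGEN_ROW_THRESHOLD = 10_000


def interpreted_sort_key(sort_columns):
    """Generic path: walks the sort-column list again for every row."""
    def key(row):
        return tuple(row[c] for c in sort_columns)
    return key


def generated_sort_key(sort_columns):
    """Codegen path: emit source specialized to the known sort columns,
    then compile it once, so the per-row loop over columns disappears."""
    if not sort_columns:
        raise ValueError("at least one sort column is required")
    fields = ", ".join(f"row[{c!r}]" for c in sort_columns)
    source = f"def key(row):\n    return ({fields},)\n"
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["key"]


def order_by(rows, sort_columns, threshold=CODEGEN_ROW_THRESHOLD):
    """Sort rows, paying the compilation cost only when the row count
    is large enough to amortize it. Returns the sorted rows and the
    name of the path that was taken."""
    if len(rows) >= threshold:
        key, path = generated_sort_key(sort_columns), "codegen"
    else:
        key, path = interpreted_sort_key(sort_columns), "interpreted"
    return sorted(rows, key=key), path


if __name__ == "__main__":
    rows = [(3, "c"), (1, "b"), (2, "a")]
    print(order_by(rows, [0], threshold=2))    # takes the codegen path
    print(order_by(rows, [0], threshold=100))  # takes the interpreted path
```

Both paths return the same ordering; only the cost profile differs. A suitable threshold would have to be measured for the actual engine, since it depends on how expensive compilation is relative to per-row interpretation.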
Keywords/Search Tags: Just-in-time compilation, In-memory database, Column store