Research And Implementation Of Data Filtering Optimization Technology Based On FPGA

Posted on: 2021-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2428330623968148
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of the Internet, mass data storage and big data technologies have advanced rapidly, and growing data volumes have made CPU performance a bottleneck. To address this problem, coprocessors are widely used to improve query efficiency in mass data storage. Because of its parallel and pipelined characteristics, the field-programmable gate array (FPGA), a programmable logic device, has attracted wide attention as a way to improve system query efficiency. This thesis studies how to use FPGAs to improve overall data query and filtering performance in a big data architecture that separates storage from computation.

Firstly, based on the different file formats used by big data systems, this thesis proposes two query filtering schemes, one for the TextFile storage format and one for the RCFile storage format. Both schemes make full use of the parallel and pipelined characteristics of the FPGA to improve query efficiency, and both are integrated on the same FPGA board. Experiments show that query performance improves by 62% for TextFile data and by 66% for RCFile data compared with traditional methods.

Secondly, this thesis proposes a CPU-FPGA architecture based on time prediction and applies it to a big data system in which storage nodes and computing nodes are separated. The architecture provides two data forwarding paths. In the traditional path, the CPU forwards the data directly to the computing node. In the other path, the FPGA pre-filters the data and only the filtered data is forwarded to the computing nodes, which can greatly reduce the transmitted data size and the communication overhead. A time prediction algorithm determines the forwarding path, that is, whether each data filtering task is better handled by the CPU or by the FPGA. By selecting the better path for each task, the data query and filtering performance of the whole big data system can be improved.

In the CPU-FPGA architecture based on time prediction, the data filtered by the FPGA must still be forwarded to the network through the CPU, which wastes some communication. To solve this problem, this thesis proposes NIC-QF, an FPGA-based network interface card design with a filtering function. In this design the FPGA acts as an invisible coprocessor that integrates the data filtering function of a coprocessor with the data forwarding function of a network interface card. The CPU is therefore unaware of the FPGA, and the upper-layer software needs little modification: it is enough to replace the traditional network card with NIC-QF. When the storage node forwards data to the computing node, the data is queried and filtered on NIC-QF as it passes through, which effectively reduces the amount of data forwarded to the network and relieves the computing pressure on the computing nodes. Experiments show that a system using NIC-QF reduces time overhead by 56% compared with a system using an ordinary NIC. This improvement is closely related to the amount of filtered data: as the amount of filtered data increases, the improvement becomes more pronounced.

Finally, an FPGA-based intelligent storage engine architecture is proposed for MongoDB. The intelligent storage engine, which combines an FPGA with an SSD, is used to store hot data and reduce memory consumption. For document data, a parameterized query filtering scheme is designed that uses pipelining to filter the <key,value> pairs in documents in parallel. Experiments show that the proposed system is 56% faster than the original MongoDB.
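
The choice between the two forwarding paths in the time-prediction CPU-FPGA architecture can be illustrated with a minimal sketch. The abstract does not give the thesis's prediction algorithm, so everything here is an assumption made for illustration: the linear cost model, the `Task` and `Rates` types, the `selectivity` parameter (fraction of data that survives filtering), and all throughput values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    size_bytes: int      # data scanned on the storage node
    selectivity: float   # assumed fraction of data that survives filtering (0..1)


@dataclass
class Rates:
    # Hypothetical throughputs in bytes per second, plus one fixed cost.
    network: float       # storage node -> computing node link
    cpu_filter: float    # software filtering after the data is forwarded
    fpga_filter: float   # filtering on the FPGA before forwarding
    fpga_setup_s: float  # assumed fixed cost of handing a task to the FPGA


def predict_cpu_path(task: Task, rates: Rates) -> float:
    # Traditional path: forward all data, then filter in software.
    return task.size_bytes / rates.network + task.size_bytes / rates.cpu_filter


def predict_fpga_path(task: Task, rates: Rates) -> float:
    # Pre-filter path: filter on the FPGA, forward only the surviving data.
    surviving = task.size_bytes * task.selectivity
    return (rates.fpga_setup_s
            + task.size_bytes / rates.fpga_filter
            + surviving / rates.network)


def choose_path(task: Task, rates: Rates) -> str:
    # Pick whichever path has the smaller predicted completion time.
    if predict_fpga_path(task, rates) < predict_cpu_path(task, rates):
        return "fpga"
    return "cpu"
```

Under this assumed model, a large task that discards most of its data favors the FPGA path, while a small task that keeps most of its data is dominated by the fixed FPGA hand-off cost and stays on the CPU path.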
Keywords/Search Tags:FPGA, big data system, query filtering acceleration, file storage formats, CPU-FPGA architecture