Distributed In-Memory Hybrid Index For Big Data Stream

Posted on: 2018-12-15 | Degree: Master | Type: Thesis
Country: China | Candidate: J J Xiang | Full Text: PDF
GTID: 2348330518974802 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of information and communication technology (ICT), data acquisition has become increasingly convenient. Data stream processing is now widely used in fields such as industrial and agricultural monitoring, communications, financial analysis, and the Internet of Things. A data stream is real-time, volatile (data that is not captured is lost), unbounded, bursty, and may arrive out of order. Because the stream is fast and the data volume is large, the system must process a large amount of data in a short time. Efficient storage and indexing of large data streams is therefore a challenging research topic in the database field.

This thesis proposes a distributed in-memory B+ tree index for data streams. The index uses a time window mechanism and is built on a two-level B+ tree. The data stream is cut into consecutive time windows, and an inner B+ tree is built for each time window. Each window then yields a <key,value> pair, in which the key is the time stamp of the time window and the value consists of the inner B+ tree's root node and that root node's address. These pairs are used to construct the outer B+ tree.

The main contributions of this thesis are as follows:
1. This thesis proposes a method for constructing the inner B+ tree, named MBSort SBLoad. This method is fast and has low latency, but it cannot provide near-real-time queries while tuples are being cached.
2. This thesis presents a second method for constructing the inner B+ tree, named MBSort MBLoad. While retaining fast construction, this method can also provide near-real-time queries.
3. This thesis describes a distributed in-memory B+ tree index system for streaming data. The system provides distributed, efficient storage of large data streams and an indexing service with high concurrency and low latency. It addresses the problem that a traditional B+ tree cannot keep up with the frequent updates of a data stream.

The effectiveness of the proposed method is verified by experiments.
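The two-level, time-windowed layout described in the abstract can be illustrated with a minimal single-process sketch. It is not the thesis's implementation: sorted Python lists searched with `bisect` stand in for the inner and outer B+ trees, nothing is distributed, and the names `TwoLevelWindowIndex`, `insert`, `flush`, and `query` are hypothetical. It also assumes tuples arrive with non-decreasing timestamps, which real streams do not guarantee.

```python
from bisect import bisect_left, bisect_right


class TwoLevelWindowIndex:
    """Illustrative stand-in: sorted lists replace the inner and outer B+ trees."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window_starts = []  # outer level: sorted window start timestamps
        self.inner = {}          # window start -> (sorted keys, values in key order)
        self.open_start = None   # start time of the window currently being buffered
        self.buffer = []         # (key, value) tuples cached for the open window

    def insert(self, timestamp, key, value):
        # Assumption: timestamps are non-decreasing, so windows close in order.
        start = timestamp - timestamp % self.window_size
        if self.open_start is not None and start != self.open_start:
            self._seal()
        self.open_start = start
        self.buffer.append((key, value))

    def _seal(self):
        # Sort the cached tuples once and bulk-build the inner structure,
        # then register the window under its start timestamp in the outer level.
        if not self.buffer:
            return
        ordered = sorted(self.buffer, key=lambda kv: kv[0])
        keys = [k for k, _ in ordered]
        values = [v for _, v in ordered]
        self.inner[self.open_start] = (keys, values)
        self.window_starts.append(self.open_start)
        self.buffer = []

    def flush(self):
        # Seal the open window so its tuples become queryable.
        self._seal()
        self.open_start = None

    def query(self, t_lo, t_hi, k_lo, k_hi):
        """Values with key in [k_lo, k_hi] from sealed windows starting in [t_lo, t_hi]."""
        i = bisect_left(self.window_starts, t_lo)
        j = bisect_right(self.window_starts, t_hi)
        out = []
        for start in self.window_starts[i:j]:
            keys, values = self.inner[start]
            a = bisect_left(keys, k_lo)
            b = bisect_right(keys, k_hi)
            out.extend(values[a:b])
        return out
```

In this sketch, tuples in the open window are invisible to `query` until the window is sealed. That is loosely analogous to the limitation the abstract notes for the first construction method, though the sketch does not attempt to reproduce either MBSort SBLoad or MBSort MBLoad.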
Keywords/Search Tags: B+ tree, data stream, distributed, in-memory index, big data