Real-time Storage Technology For Data Stream

Posted on: 2016-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Chen
Full Text: PDF
GTID: 2308330464969336
Subject: Computer technology
Abstract/Summary:
A new type of data, the data stream, has emerged in fields such as industrial monitoring, transportation, and the Internet of Things. In addition to the 4V features of big data, a big data stream is continuous, real-time, and unbounded. Traditional RDBMSs have serious deficiencies when processing data streams. Big Data Stream Management Systems have therefore become a hot research topic in both academia and industry. The storage subsystem is a core module of a Big Data Stream Management System, and the index is a key component of the storage subsystem. This thesis focuses on index construction and storage persistence in Big Data Stream Management Systems.

The contributions of the thesis are as follows:
1. An index bulk construction scheme for data streams is proposed. Based on the characteristics of data streams and on existing bulk loading and bulk insertion techniques for the B+ tree, we propose a bulk construction scheme for the B+ tree that combines sort-based bulk loading with bulk insertion, and we analyze the relationship between the parameter settings of bulk construction and the data stream speed.
2. A persistent clustered storage strategy for data streams is proposed. The relationship between the parameters of the persistent clustered storage strategy and the data stream speed is analyzed, and the maximum throughput of the strategy is calculated.

We have implemented the proposed algorithms and carried out the corresponding experiments. The experimental results verify the validity of the proposed B+ tree bulk construction scheme and of the persistent clustered storage strategy.
Keywords/Search Tags: data stream, B+ tree, bulk building, persistent storage
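
To make the first contribution more concrete, the following is a minimal sketch of sort-based bulk loading for a B+ tree, written in Python. It is not the algorithm from the thesis, which is not reproduced on this page; it only illustrates the general technique of sorting one buffered batch of stream records and building the tree bottom-up. The names `Node`, `bulk_load`, `search`, and the `fanout` parameter are assumptions made for illustration, keys are assumed to be unique, and the bulk insertion half of the scheme is not shown.

```python
from bisect import bisect_right


class Node:
    """A B+ tree node (hypothetical layout, for illustration only)."""

    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # separator keys (internal) or record keys (leaf)
        self.children = children  # child nodes; None for a leaf
        self.values = values      # record values; None for an internal node
        self.next = None          # link to the right sibling leaf


def min_key(node):
    """Return the smallest key stored in the subtree rooted at node."""
    while node.children is not None:
        node = node.children[0]
    return node.keys[0]


def bulk_load(records, fanout=4):
    """Build a B+ tree bottom-up from one batch of (key, value) pairs."""
    if not records:
        raise ValueError("cannot bulk load an empty batch")
    # Step 1: sort the buffered batch by key.
    records = sorted(records, key=lambda r: r[0])
    # Step 2: pack the sorted records into leaves of at most `fanout` entries.
    leaves = []
    for i in range(0, len(records), fanout):
        chunk = records[i:i + fanout]
        leaves.append(Node([k for k, _ in chunk], values=[v for _, v in chunk]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Step 3: build internal levels until a single root remains.
    # The last node of a level may be under-full; this sketch does not rebalance.
    level = leaves
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            separators = [min_key(child) for child in group[1:]]
            parents.append(Node(separators, children=group))
        level = parents
    return level[0]


def search(root, key):
    """Return the value stored under key, or None if the key is absent."""
    node = root
    while node.children is not None:
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node.values[i]
    return None
```

In a streaming setting, the batch size would presumably be tuned against the arrival rate of the stream, which is the kind of parameter relationship the abstract says the thesis analyzes. A small `fanout` is used here only to keep the example easy to trace; a real index would size nodes to match disk pages.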