
Research On Techniques Of Optimizing Data Storage And Query Based On MongoDB

Posted on: 2017-03-26; Degree: Master; Type: Thesis
Country: China; Candidate: L Qi; Full Text: PDF
GTID: 2308330488497123; Subject: Computer application technology
Abstract/Summary:
With the explosive growth of Internet data, storing and querying massive data sets has become a major challenge for data centers. MongoDB, a non-relational database, is widely used because of its flexible data storage format and high query performance. However, MongoDB lacks a mechanism for detecting and removing duplicate data. As redundant data accumulates, considerable storage space is wasted and the performance of the storage system degrades.

To detect duplicate data, a data fingerprint can serve as the unique identifier of a piece of data. In massive data deduplication, however, the efficiency of fingerprint lookup is likely to become the bottleneck of the MongoDB system. This thesis examines the characteristics of two classes of fingerprint query algorithms, tree-based and hash-based, in the context of duplicate data detection. It also discusses the Bloom filter, a high-performance hash-based query algorithm. On this basis, it proposes the Expanding Bloom Filter query algorithm, which addresses the problem of expanding a Bloom filter and is intended for massive data deduplication. Building on the storage mechanism of GridFS, the file storage system in MongoDB, a fingerprint mapping table is created to check whether a data block already exists, which achieves block-level data deduplication.

Experimental results show that the improved MongoDB file system using the Expanding Bloom Filter scales better when data detection and deduplication are applied. Compared with the original Bloom filter, query performance is also improved.
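The abstract does not give the details of the Expanding Bloom Filter or of the fingerprint mapping table, so the following Python sketch only illustrates the general idea under stated assumptions. It uses a generic growable filter (a chain of fixed-size Bloom filters, with a new one appended when the latest fills up), which is a substitute technique and not the thesis's algorithm. The names BloomFilter, GrowableBloomFilter and DedupStore, the SHA-256 fingerprint, the in-memory dictionary standing in for the mapping table, and all sizes are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative only)."""

    def __init__(self, num_bits, num_hashes, capacity):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.capacity = capacity  # items accepted before the filter counts as full
        self.count = 0
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two halves of a SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

    def is_full(self):
        return self.count >= self.capacity


class GrowableBloomFilter:
    """Chain of Bloom filters; an assumed stand-in for an expandable filter."""

    def __init__(self, capacity=1000, bits_per_item=10, num_hashes=7):
        self.capacity = capacity
        self.bits_per_item = bits_per_item
        self.num_hashes = num_hashes
        self.filters = [self._new_filter()]

    def _new_filter(self):
        return BloomFilter(self.capacity * self.bits_per_item,
                           self.num_hashes, self.capacity)

    def add(self, item):
        # Grow by appending a fresh filter instead of rehashing old entries.
        if self.filters[-1].is_full():
            self.filters.append(self._new_filter())
        self.filters[-1].add(item)

    def __contains__(self, item):
        return any(item in f for f in self.filters)


class DedupStore:
    """Block-level deduplication with a fingerprint -> block table (in memory)."""

    def __init__(self, block_size=4, filter_capacity=4):
        self.block_size = block_size
        self.seen = GrowableBloomFilter(capacity=filter_capacity)
        self.blocks = {}  # hypothetical fingerprint mapping table

    def put(self, data):
        refs = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            fp = hashlib.sha256(block).digest()
            # The filter can report false positives, so confirm hits in the table.
            if not (fp in self.seen and fp in self.blocks):
                self.blocks[fp] = block
                self.seen.add(fp)
            refs.append(fp)
        return refs

    def get(self, refs):
        return b"".join(self.blocks[fp] for fp in refs)
```

In this sketch the filter answers "definitely new" cheaply for most unseen blocks, and the table lookup is needed only when the filter reports a possible match. A real deployment would keep the table in a MongoDB collection alongside the GridFS chunks; that wiring is omitted here because the abstract does not describe it.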
Keywords/Search Tags: MongoDB, Data Detection, Query, Bloom Filter