Research And Implementation Of A Large Scale Time-series Data Storage System

Posted on: 2014-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: J G Wang
Full Text: PDF
GTID: 2268330422463257
Subject: Communication and Information System
Abstract/Summary:
RRDtool is a popular file-based database for storing time-series data. However, an RRDtool-based storage system performs poorly when a large number of RRD files must be updated, because of the operating system's readahead and buffer-cache behavior. This limits the scalability of the system to tens of thousands, or perhaps one hundred thousand, RRD files on a single machine. A second challenge is capacity flexibility, since the system must store a rapidly increasing number of RRD files. Moreover, it is essential to keep the system highly available despite component or system failures.

In response to these issues, this thesis investigates and implements a storage system for large-scale time-series data that combines mem-RRD and MooseFS. Mem-RRD is designed to replace the original RRDtool; it exploits user-level buffering and achieves better I/O performance. MooseFS is a distributed file system that provides high availability and flexible capacity. The system is built and deployed in a network measurement environment, and its effectiveness is demonstrated through detailed testing and observation. In short, this large-scale time-series data storage system delivers good I/O performance, availability, and scalability.
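The abstract does not describe how mem-RRD implements user-level buffering, so the following is only a minimal sketch of the general idea: updates are queued in memory per series and written out in batches, so that many logical updates cost a single file write. The class `BufferedSeriesWriter`, the `flush_threshold` parameter, the plain-text `timestamp:value` file layout, and the `demo` function are all hypothetical illustrations and are not taken from the thesis or from RRDtool.

```python
import os
import tempfile
from collections import defaultdict


class BufferedSeriesWriter:
    """Hypothetical user-level write buffer; not the thesis's mem-RRD code."""

    def __init__(self, directory, flush_threshold=4):
        self.directory = directory
        self.flush_threshold = flush_threshold
        self.pending = defaultdict(list)  # series name -> [(timestamp, value)]
        self.writes = 0  # number of file writes actually issued

    def update(self, series, timestamp, value):
        """Queue one sample in memory; flush the series once enough accumulate."""
        self.pending[series].append((timestamp, value))
        if len(self.pending[series]) >= self.flush_threshold:
            self.flush(series)

    def flush(self, series):
        """Write all queued samples for one series with a single append."""
        samples = self.pending.pop(series, [])
        if not samples:
            return
        path = os.path.join(self.directory, series + ".txt")
        with open(path, "a", encoding="utf-8") as handle:
            handle.write("".join(f"{ts}:{val}\n" for ts, val in samples))
        self.writes += 1

    def flush_all(self):
        """Flush every series that still has queued samples."""
        for series in list(self.pending):
            self.flush(series)


def demo():
    """Send 10 updates to each of 3 series and count the resulting file writes."""
    with tempfile.TemporaryDirectory() as directory:
        writer = BufferedSeriesWriter(directory, flush_threshold=5)
        for step in range(10):
            for name in ("cpu", "mem", "net"):
                writer.update(name, 1000 + step, float(step))
        writer.flush_all()
        with open(os.path.join(directory, "cpu.txt"), encoding="utf-8") as handle:
            lines = handle.read().splitlines()
        return writer.writes, lines
```

In this sketch, calling `demo()` issues 30 logical updates but only 6 file writes, because each series is flushed once per 5 queued samples. The trade-off of any such buffer is that samples still held in memory are lost if the process crashes before a flush.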
Keywords/Search Tags: Time-series data, mem-RRD, MooseFS, I/O performance, Availability, Scalability