Content addressable data management

Posted on: 2008-05-02
Degree: Ph.D
Type: Thesis
University: The Pennsylvania State University
Candidate: Nath, Partho
Full Text: PDF
GTID: 2448390005976178
Subject: Computer Science
Abstract/Summary:
A direct implication of both industry and academia proclaiming the age of tera-scale (and even peta-scale) computing is that applications have become more data intensive than ever. The increased data volume from applications tackling ever larger problems has fueled the need for efficient management of this data. In this thesis, we evaluate a technique called Content Addressable Storage (CAS) for managing large volumes of data. The evaluation focuses on the benefits and drawbacks of using CAS for (i) improved application performance via lockless, lightweight synchronization of accesses to shared storage data; (ii) improved cache performance; (iii) increased storage capacity; and (iv) increased network bandwidth. We present the design of a CAS-based file store that significantly improves storage performance by providing lightweight, lockless, user-defined consistency semantics. As a result, our file system shows a 28% increase in read bandwidth and a 13% increase in write bandwidth over a popular file system in common use. We use the same experimental file system to analyze CAS on data from real-world application benchmarks. We also estimate the potential benefits of using CAS for a virtual-machine-based user mobility application that was in active use at a public deployment for a period of over seven months.
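
For readers unfamiliar with the idea, the following minimal sketch illustrates the general principle behind content addressable storage: a block is stored under a key derived from its own contents, so identical blocks are kept only once. It is not the design described in the thesis. The SHA-256 hash, the 4096-byte block size, and the names ContentStore and store_file are all assumptions chosen for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        # Maps a hex digest to the block bytes it was computed from.
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The key is the hash of the content (SHA-256 is an assumption here).
        key = hashlib.sha256(data).hexdigest()
        # A block that is already present is not stored a second time.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def block_count(self) -> int:
        return len(self._blocks)


def store_file(store: ContentStore, payload: bytes, block_size: int = 4096):
    """Split payload into fixed-size blocks and return the list of block keys."""
    return [
        store.put(payload[i:i + block_size])
        for i in range(0, len(payload), block_size)
    ]


if __name__ == "__main__":
    store = ContentStore()
    recipe = store_file(store, b"x" * 8192)  # two identical 4096-byte blocks
    print(len(recipe), store.block_count())
```

Because the two blocks in the usage example have the same contents, they hash to the same key and occupy a single slot in the store, which is the mechanism behind the storage-capacity benefit mentioned in item (iii).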
Keywords/Search Tags: Data, Application, CAS