
Modeling file access patterns to improve caching performance

Posted on: 2001-01-02    Degree: Ph.D    Type: Thesis
University: University of California, Santa Cruz    Candidate: Kroeger, Thomas M    Full Text: PDF
GTID: 2468390014957325    Subject: Computer Science
Abstract/Summary:
Computers are predictable: they are driven by sequences of instructions that occur in repeated, non-random patterns. These instructions drive the order in which files are accessed. As a result, the patterns in which files are accessed offer information that can predict upcoming file accesses. Most modern caches ignore these patterns and fail to use information that would enable significant reductions in I/O latency. While prefetching heuristics that expect sequential accesses are often effective at reducing I/O latency, they cannot be applied to inter-file accesses, because the abstraction of a file has no intrinsic concept of a successor. This limits the ability of modern file system caches to prefetch.

Our thesis is that there is a strong correlation between individual file accesses and the patterns in which files are accessed, that these patterns can be modeled efficiently, and that the resulting models can reliably predict upcoming requests. A predictive cache that prefetches based on these models can prefetch where sequential heuristics cannot, improving file system cache performance and reducing I/O latencies.

This dissertation presents our work in adapting text compression techniques to model file access patterns and facilitate predictive prefetching. We examine several previously used models and present two new techniques, Partitioned Context Modeling (PCM) and Extended Partitioned Context Modeling (EPCM), that adapt to changing patterns and are space efficient. Using traces of file system activity, we show that PCM can predict the next event with a high degree of accuracy, and that prefetching based on such a model can significantly reduce cache misses. We have implemented predictive prefetching in the Linux kernel and show that these techniques can yield dramatic reductions in I/O latency. We tested our implementation with three application-based benchmarks and reduced I/O latency by as much as 92%.
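To illustrate the kind of context modeling the abstract describes, the sketch below shows a minimal multi-order context model (in the spirit of PPM-style text compressors) that counts which file tends to follow each recent access sequence and predicts the most frequent successor of the longest matching context. This is a hypothetical illustration written for this summary, not the dissertation's PCM or EPCM implementation; the class name `ContextModel` and the `max_order` parameter are assumptions for the example.

```python
from collections import defaultdict, Counter

class ContextModel:
    """Minimal multi-order context model for file-access prediction.

    For every recent access sequence up to ``max_order`` files long,
    it counts how often each file followed that sequence. Prediction
    uses the longest context that has been seen before.
    """

    def __init__(self, max_order=2):
        self.max_order = max_order
        self.counts = defaultdict(Counter)  # context tuple -> successor counts
        self.history = []                   # most recent accesses

    def access(self, filename):
        """Record a file access, updating every context order."""
        for order in range(1, self.max_order + 1):
            if len(self.history) >= order:
                ctx = tuple(self.history[-order:])
                self.counts[ctx][filename] += 1
        self.history.append(filename)
        # Only the last max_order accesses can form a context.
        if len(self.history) > self.max_order:
            self.history.pop(0)

    def predict(self):
        """Return the most likely next file, or None if no context matches."""
        for order in range(min(self.max_order, len(self.history)), 0, -1):
            ctx = tuple(self.history[-order:])
            if ctx in self.counts:
                return self.counts[ctx].most_common(1)[0][0]
        return None
```

Feeding the model a repetitive trace, such as a compile loop that repeatedly opens the same files in the same order, lets it predict the likely next access, which a prefetching cache could then fetch before the request arrives:

```python
model = ContextModel(max_order=2)
for f in ["a.c", "a.h", "b.c", "a.c", "a.h", "b.c", "a.c"]:
    model.access(f)
print(model.predict())  # -> "a.h", the file that has followed "b.c", "a.c"
```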
We then examine the potential for such predictive prefetching on the Web. Using traces of Web activity, we show that prefetching can be a significant factor in latency reduction. Through our research into predictive prefetching, we show that modeling a computer system's previous events provides an accurate indicator of upcoming events and offers the operating system more complete information, enabling it to better manage the system's resources.
Keywords/Search Tags: Patterns, File, I/O latency, Modeling, Predictive prefetching, System