
Prefetching Algorithms In Linux Kernel

Posted on: 2009-05-12 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: F G Wu | Full Text: PDF
GTID: 1118360275455409 | Subject: Control theory and control engineering
Abstract/Summary:
The ever-increasing gap between CPU and I/O speed has created the most prominent imbalance in computer systems. The improvement of disk throughput has been unable to keep up with rapidly accelerating CPUs, and I/O delay lags seriously behind. Prefetching is a vital technique for improving disk I/O performance. It transforms small synchronous I/O requests from applications into large asynchronous readahead I/Os, so that I/O delays can be hidden from applications and I/O can be parallelized. Sequential prefetching has become a standard function in modern operating systems. The Linux kernel implements a generic file readahead algorithm in its virtual file system layer. As Linux is widely deployed and runs an increasing variety of workloads, its in-kernel readahead algorithm has been challenged by many unexpected and subtle problems. To name a few: readahead thrashing arises when readahead pages are evicted prematurely under memory pressure; readahead attempts on already cached pages are undesirable; and interrupted-then-retried reads can easily fool the sequential detection logic.
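The sequential prefetching described above can be illustrated with a minimal C sketch. This is not the kernel's actual code; all names (`read_state`, `is_sequential`, `next_readahead_size`) and the exact growth policy are hypothetical, chosen only to show the classic pattern: detect back-to-back reads, then ramp the readahead window up toward a cap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-file read state: where the last read ended.
 * Offsets and counts are in pages. */
struct read_state {
    size_t prev_offset; /* page index where the last read started */
    size_t prev_count;  /* number of pages the last read consumed */
};

/* A read is treated as sequential if it starts at the beginning of
 * the file or exactly where the previous read left off. */
static int is_sequential(const struct read_state *st, size_t offset)
{
    return offset == 0 || offset == st->prev_offset + st->prev_count;
}

/* Illustrative window sizing: double on each sequential hit, capped
 * at `max`; fall back to the small initial size on a random read. */
static size_t next_readahead_size(size_t cur, int sequential,
                                  size_t init, size_t max)
{
    if (!sequential)
        return init;
    size_t next = cur * 2;
    return next > max ? max : next;
}
```

A thrashing-aware variant (as the thesis proposes) would additionally shrink the window when readahead pages are found evicted before use, instead of blindly doubling.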
In this thesis, we present a demand prefetching framework. By introducing a pair of readahead triggers, it can handle most abnormal cases in a natural and uniform way. The modular design simplifies the pattern detection and readahead logic and makes them independently improvable. With the trend towards multi-core and multi-threaded processors, software developers multi-thread their applications and issue more parallel I/Os. It is a big challenge for a prefetching algorithm to efficiently recognize and handle the interleaved sequential read streams in parallel I/O workloads. On the basis of the demand prefetching framework, we designed and implemented a reactive sequential detection algorithm and a proactive one, which feature the use of page status as a complement to the less reliable readahead state, as well as a relaxed sequentiality criterion. They can support various semi-sequential access patterns: sequential streams mixed with random accesses; interleaved read patterns created by concurrent sequential accesses to one file descriptor; sparse sequential reads; locally reordered sequential NFS reads; high-density random read areas; row iterations over huge arrays; etc. The reactive algorithm brings no overhead to the classical single-thread case; for parallel I/Os its space and time complexities are trivial and independent of the number of concurrent streams. The proactive algorithm is equipped with an efficient sequential detection logic, which guarantees the detection of sequential streams and hot random read areas. Its adaptive readahead size can prevent readahead thrashing and hence achieve better performance.
Experiments show that the proposed algorithms perform much better than legacy Linux readahead: network throughput is 3-4 times better in the case of thrashing; sequential reads intermixed with random ones are faster by 29%; I/O throughput of interleaved streams can be 4-27 times better, while application-visible I/O latencies are improved by up to 35 times; on serving large files via NFS, throughput can be 1.8 times better; on serving large files with lighttpd, disk utilization is decreased by 26% while providing 17% more network throughput; and on randomly populating a file, the proactive algorithm performs 3 times better than the legacy one. Our prefetching algorithms have real-world relevance: the demand prefetching framework and the reactive sequential prefetching algorithm have been adopted in Linux kernels 2.6.23 and 2.6.24.
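The pair of readahead triggers at the heart of the demand prefetching framework can be sketched as follows. This is an illustrative C fragment, not the kernel's implementation; the names (`page_info`, `classify_access`, the `ra_marked` field) are hypothetical stand-ins for the idea that a cache miss demands synchronous readahead, while hitting a previously marked lookahead page fires asynchronous readahead of the next window.

```c
#include <assert.h>

/* The two trigger outcomes for a page access, plus "no action". */
enum ra_trigger { RA_NONE, RA_SYNC, RA_ASYNC };

/* Minimal stand-in for the state the trigger logic inspects. */
struct page_info {
    int cached;    /* is the page already in the page cache?      */
    int ra_marked; /* was it marked as the lookahead trigger page? */
};

/* Classify one page access against the two readahead triggers. */
static enum ra_trigger classify_access(const struct page_info *pg)
{
    if (!pg->cached)
        return RA_SYNC;  /* miss: the reader must wait, read ahead now */
    if (pg->ra_marked)
        return RA_ASYNC; /* hit the mark: pipeline the next window     */
    return RA_NONE;      /* plain cache hit: nothing to do             */
}
```

Because both triggers are driven by the state of the page being accessed rather than by fragile per-stream bookkeeping, the same logic naturally covers cached-page regions, retried reads, and interleaved streams, which is the uniformity the abstract claims.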
Keywords/Search Tags: Linux, operating system, I/O performance, file prefetching, parallel I/O, access pattern, sequential detection, page cache, readahead thrashing