
Improving file system performance with file access predictions

Posted on: 2003-05-05
Degree: Ph.D.
Type: Dissertation
University: University of California, Santa Cruz
Candidate: Yeh, Tsozen
Full Text: PDF
GTID: 1468390011485915
Subject: Computer Science
Abstract/Summary:
Hard disk drives (HDDs) have long been an important component of computer systems. Their history dates back to the early 1950s, and IBM introduced its first production hard disk, RAMAC (Random Access Method of Accounting and Control), in 1956. Disk technology has improved tremendously in recent years in both areal density and speed. However, disk speed still lags CPU speed by about six orders of magnitude, making disk operations a primary performance bottleneck in modern computer systems.

This research describes predicting future file accesses and prefetching the needed files into cache memory. By doing so, the system can avoid much of the performance penalty of disk operations. We present a new model, Program-Based Successor (PBS), to predict which files running programs may need next.

Files are accessed by programs, and consecutive accesses to different files are largely determined by the execution of different programs. In other words, if we know which program initiated the current file access, we have a better chance of correctly predicting which files that program will need next. PBS uses this knowledge to perform file access prediction. Through a series of trace-based simulation experiments, we show that, compared with the common benchmark Last Successor (LS), PBS makes about 25% to 40% fewer incorrect predictions while making at least as many correct predictions. As a result, the overall system performance penalty from disk operations can be significantly reduced.

We also apply PBS to file grouping, which is concerned with placing related files physically close to each other on the disk. When the system reads files from the disk, the disk head must first move to the corresponding location to retrieve the data, which significantly prolongs disk latency. If we know that an access to one file is likely to be followed by an access to another file (or files), then grouping those related files on the disk could reduce the total distance the disk head needs to travel to access them, and therefore the time required to retrieve data from the disk. Our results on a simplified disk model show that, with PBS grouping, the total distance the disk head needs to travel could be reduced by 65% compared with the distance traveled when no grouping is applied.

As CPUs get faster and faster, disk operations hurt system performance more than they did before. By taking a program-based approach, PBS shows promise in reducing the negative impact of disk operations through file prediction and file grouping.
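The difference between LS and PBS described in the abstract can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the dissertation's implementation: it assumes LS remembers, for each file, the file that followed it last time in the global access stream, and that PBS keeps the same record per (program, file) pair and follows each program's own access sequence. The class names, the `evaluate` helper, and the toy `TRACE` are all hypothetical.

```python
class LastSuccessor:
    """LS (assumed form): predict the file that followed the previous file last time."""

    def __init__(self):
        self.successor = {}  # file -> file that followed it most recently
        self.prev = None     # last file seen in the global access stream

    def predict(self, program):
        # The program name is ignored: LS only looks at the global stream.
        return self.successor.get(self.prev)

    def record(self, program, file):
        if self.prev is not None:
            self.successor[self.prev] = file
        self.prev = file


class ProgramBasedSuccessor:
    """PBS (assumed form): the same idea, but tracked separately per program."""

    def __init__(self):
        self.successor = {}  # (program, file) -> next file that program accessed
        self.prev = {}       # program -> last file that program accessed

    def predict(self, program):
        last = self.prev.get(program)
        return self.successor.get((program, last))

    def record(self, program, file):
        last = self.prev.get(program)
        if last is not None:
            self.successor[(program, last)] = file
        self.prev[program] = file


def evaluate(predictor, trace):
    """Replay a trace and return (correct, incorrect, no_prediction) counts."""
    correct = incorrect = no_prediction = 0
    for program, file in trace:
        guess = predictor.predict(program)
        if guess is None:
            no_prediction += 1
        elif guess == file:
            correct += 1
        else:
            incorrect += 1
        predictor.record(program, file)
    return correct, incorrect, no_prediction


# Hypothetical trace: two programs whose accesses interleave differently
# from one round to the next.
TRACE = [
    ("cc", "a.c"), ("cc", "a.h"), ("vi", "notes"), ("vi", "todo"),
    ("cc", "a.c"), ("vi", "notes"), ("cc", "a.h"), ("vi", "todo"),
    ("cc", "a.c"), ("cc", "a.h"), ("vi", "notes"), ("vi", "todo"),
]

if __name__ == "__main__":
    print("LS :", evaluate(LastSuccessor(), TRACE))
    print("PBS:", evaluate(ProgramBasedSuccessor(), TRACE))
```

On this toy trace the per-program bookkeeping keeps PBS from being misled when the two programs interleave differently, so it makes fewer incorrect predictions than LS. The sketch illustrates the mechanism only and does not reproduce the 25% to 40% figure, which comes from the dissertation's trace-based simulations.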
Keywords/Search Tags: Disk, File, System, PBS