CMU-CS-97-204
Computer Science Department
School of Computer Science, Carnegie Mellon University
Informed Prefetching and Caching
Russel Hugo Patterson III
December 1997
Ph.D. Thesis [Department of Electrical and Computer Engineering]
CMU-CS-97-204.ps
CMU-CS-97-204.pdf
Keywords: Prefetching, caching, file systems, resource management,
cache management, TIP, I/O, cost-benefit analysis, economic markets, disk
arrays, RAID
Disk arrays provide the raw storage throughput needed to balance
rapidly increasing processor performance. Unfortunately, many
important, I/O-intensive applications have serial I/O workloads that
do not benefit from array parallelism. The performance of a single
disk remains a bottleneck on overall performance for these
applications. In this dissertation, I present aggressive, proactive
mechanisms that tailor file-system resource management to the needs of
I/O-intensive applications. In particular, I show how to use
application-disclosed access patterns (hints) to expose and exploit
I/O parallelism, and to dynamically allocate file buffers among three
competing demands: prefetching hinted blocks, caching hinted blocks
for reuse, and caching recently used data for unhinted accesses. My
approach estimates the impact of alternative buffer allocations on
application elapsed time and applies run-time cost-benefit analysis to
allocate buffers where they will have the greatest impact. I
implemented TIP, an informed prefetching and caching manager, in the
Digital UNIX operating system and measured its performance on a 175
MHz Digital Alpha workstation equipped with up to 10 disks running a
range of applications. Informed prefetching on a ten-disk array
reduces the wall-clock elapsed time of computational physics, text
search, scientific visualization, relational database queries, speech
recognition, and object linking by 10-84% with an average of 63%. On
a single disk, where storage parallelism is unavailable and avoiding
disk accesses is most beneficial, informed caching reduces the elapsed
time of these same applications by up to 36% with an average of 13%
compared to informed prefetching alone. Moreover, applied to
multiprogrammed, I/O-intensive workloads, TIP increases overall
throughput.
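The cost-benefit allocation described above can be illustrated with a toy sketch. This is not the thesis's actual model: the timing constants, function names, and the specific cost formulas below are assumptions chosen only to show the shape of the idea, which is that each buffer consumer estimates its marginal impact on elapsed time, and a buffer moves to prefetching only when the prefetcher's benefit exceeds the cheapest supplier's cost.

```python
# Toy sketch of run-time cost-benefit buffer allocation (illustrative only;
# all constants and formulas here are assumptions, not the thesis's model).

T_DISK = 15.0   # assumed average disk access time (ms)
T_HIT = 0.2     # assumed buffer-cache hit time (ms)

def prefetch_benefit(depth):
    """Assumed marginal benefit (ms saved per access) of deepening the
    prefetch pipeline by one buffer: stall shrinks roughly as T_DISK/depth."""
    return T_DISK / depth - T_DISK / (depth + 1)

def hinted_cache_cost(accesses_until_reuse):
    """Assumed cost of ejecting a hinted block held for reuse: it must be
    prefetched back before its reuse, charged per intervening access."""
    return T_DISK / accesses_until_reuse

def lru_cache_cost(hit_ratio_delta):
    """Assumed cost of shrinking the LRU cache by one buffer: the extra
    misses that buffer would have absorbed."""
    return hit_ratio_delta * (T_DISK - T_HIT)

def allocate(depth, reuse_distance, lru_delta):
    """Take a buffer from the cheapest supplier when the prefetcher's
    marginal benefit exceeds that supplier's cost; otherwise hold steady."""
    benefit = prefetch_benefit(depth)
    cost, supplier = min((hinted_cache_cost(reuse_distance), "hinted cache"),
                         (lru_cache_cost(lru_delta), "LRU cache"))
    return supplier if benefit > cost else None
```

With a shallow prefetch pipeline and cold caches, the prefetcher wins a buffer from whichever cache gives one up most cheaply; with a deep pipeline, the diminishing marginal benefit falls below either cache's cost and allocations stay put.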
260 pages