|
CMU-CS-02-115
Computer Science Department
School of Computer Science, Carnegie Mellon University
CMU-CS-02-115
Fractal Prefetching B+-Trees:
Optimizing Both Cache and Disk Performance
Shimin Chen, Phillip B. Gibbons*, Todd C. Mowry, Gary Valentin**
March 2002
CMU-CS-02-115.ps
CMU-CS-02-115.pdf
Keywords: Cache performance, I/O performance, prefetching,
databases, B+-Tree index
B+-Trees have been traditionally optimized for I/O
performance with disk pages as tree nodes. Recently, researchers
have proposed new types of B+-Trees optimized for CPU
cache performance in main memory environments, where the tree
node sizes are one or a few cache lines. Unfortunately, due
primarily to this large discrepancy in optimal node sizes,
existing disk-optimized B+-Trees suffer from poor
cache performance while cache-optimized B+-Trees exhibit poor
disk performance. In this paper, we propose fractal prefetching
B+-Trees (fpB+-Trees), which embed
"cache-optimized" trees within "disk-optimized" trees, in order to
optimize both cache and I/O performance. We design and evaluate two
approaches to breaking disk pages into cache-optimized nodes:
disk-first and cache-first. These approaches are
somewhat biased in favor of maximizing disk and cache performance,
respectively, as demonstrated by our results. Both implementations
of fpB+-Trees achieve dramatically better cache
performance than disk-optimized B+-Trees: a factor of
1.1 - 1.8 improvement for search, up to a factor of 4.2 improvement
for range scans, and up to a 20-fold improvement for updates. In
addition, fpB+-Trees accelerate I/O performance for
range scans by using jump-pointer arrays to prefetch leaf pages,
thereby achieving a speed-up of 2.5-5 on IBM's DB2 Universal Database.
27 pages
*Information Sciences Research Center, Bell Laboratories, 600 Mountain
Avenue, Murray Hill, NJ 07974;
Current affiliation: Intel Research Pittsburgh, 417 South Craig Street, Pittsburgh, PA 15213.
**IBM Toronto Lab, 8200 Warden Avenue, markham, Ontario, L6G 1C7, Canada
|