Computer Science Department
School of Computer Science, Carnegie Mellon University


Simple Cache Partitioning for Networked Workloads

Thomas Kim, Sol Boucher, Hyeontaek Lim,
David G. Andersen, and Michael Kaminsky*

October 2017


Keywords: Cache partitioning, isolation, resource utilization, tail latency, multi-tenancy

Modern datacenters often run a mix of latency-sensitive and throughput-oriented workloads, which are consolidated onto the same physical machines to improve resource utilization. Performance isolation under these conditions is an important component in satisfying service level objectives (SLOs). This paper examines cache partitioning as a mechanism to meet latency targets. Cache partitioning has been studied extensively in prior work; however, its effectiveness is not well understood for modern datacenter applications that enjoy recent improvements in low-latency networking, such as stack bypass and faster underlying technologies (e.g., DPDK, InfiniBand). These applications have tight tail latency SLOs of tens to hundreds of microseconds. We find that cache partitioning is effective in achieving performance isolation among diverse workload mixes for such environments, without requiring complex configurations or online controllers. On our modern multi-core server using Intel Cache Allocation Technology, we show that cache partitioning can protect the performance of latency-sensitive networked applications from local resource contention. Our code is publicly available on GitHub.

28 pages

*Intel Labs
