Computer Science Department
School of Computer Science, Carnegie Mellon University


Near-Real-Time Inference of File-Level
Mutations from Virtual Disk Writes

Wolfgang Richter, Mahadev Satyanarayanan
Jan Harkes, Benjamin Gilbert

February 2012


Keywords: Block write, cloud, cloud computing, file-level, inference, introspection, kernel virtual machine, KVM, monitoring, near-real-time, real-time, semantic, virtual disk, virtual disk write, virtual machine, VM, virtual machine introspection, VMI, virtual machine monitor, VMM

We describe a new mechanism for cloud computing that enables near-real-time monitoring of virtual disk write streams across an entire cloud. Our solution has low IO overhead for the guest VM, low latency to file-level mutation notification, and a layered design for scalability. We achieve low IO overhead by duplicating the virtual disk write stream as it passes through a managing VMM. We achieve low latency by performing semantic inference at the highest level possible: the file level. We achieve cloud scale through a layered design in which each layer can filter file-level mutations, minimizing network traffic to centralized monitoring infrastructure. We assume this technique is used on pre-indexed virtual disks, most likely derived from a cooperating VM image library such as those used in clouds today. Our new cloud primitive enables system administration tasks that involve monitoring files, such as virus scanning and log file parsing, to be performed outside of the running VM instance, either on the VMM host or shipped to a central monitoring agent.
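The idea of inferring file-level mutations from block writes on a pre-indexed disk can be illustrated with a minimal sketch. It is not the report's implementation. It assumes a 512-byte sector size and an index that maps every sector to the file owning it, and all names here (BlockWrite, build_index, infer_mutations, filter_mutations) are hypothetical. A real index would also need to track file system metadata and be updated as files grow or move.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumed sector size in bytes


@dataclass(frozen=True)
class BlockWrite:
    """One write observed in the duplicated virtual disk write stream."""
    sector: int   # first sector written
    length: int   # number of bytes written


def build_index(file_extents):
    """Build a sector -> path map from {path: [(start_sector, sector_count), ...]}.

    Stands in for the pre-indexing step the abstract assumes.
    """
    index = {}
    for path, extents in file_extents.items():
        for start, count in extents:
            for sector in range(start, start + count):
                index[sector] = path
    return index


def infer_mutations(write, index):
    """Return the files (in first-touched order) overlapped by a block write."""
    n_sectors = (write.length + SECTOR_SIZE - 1) // SECTOR_SIZE
    touched = []
    for sector in range(write.sector, write.sector + n_sectors):
        path = index.get(sector)
        if path is not None and path not in touched:
            touched.append(path)
    return touched


def filter_mutations(paths, watched_prefixes):
    """Keep only mutations under watched prefixes, as one layer might do
    before forwarding notifications to a central monitor."""
    prefixes = tuple(watched_prefixes)
    return [p for p in paths if p.startswith(prefixes)]
```

Filtering close to the VMM host, as in filter_mutations, is what would keep traffic to a central monitor small in this sketch: only mutations under watched paths are forwarded.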

21 pages
