Computer Science Department
School of Computer Science, Carnegie Mellon University
Near-Real-Time Inference of File-Level
Wolfgang Richter, Mahadev Satyanarayanan
We describe a new mechanism for cloud computing that enables near-real-time monitoring of virtual disk write streams across an entire cloud. Our solution has low I/O overhead for the guest VM, low latency to file-level mutation notification, and a layered design for scalability. We achieve low I/O overhead by duplicating the virtual disk write stream as it passes through a managing VMM. We achieve low latency by performing semantic inference at the highest level possible: the file level. We achieve cloud scale through a layered design in which each layer can filter file-level mutations, so that network traffic to centralized monitoring infrastructure is minimized. We assume this technique is used on pre-indexed virtual disks, most likely derived from a cooperating VM image library such as those used in clouds today. Our new cloud primitive enables system administration tasks that involve monitoring files (virus scanning, log file parsing, etc.) to be performed outside of the running VM instance, either on the VMM host or, once shipped over the network, at a central monitoring agent.
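To make the core idea concrete, the following minimal sketch shows one way a duplicated sector-level write could be mapped to file-level mutations using a pre-built index, and then filtered before being forwarded. It is an illustration under stated assumptions, not the system described here: the names `Extent`, `DiskIndex`, `files_for_write`, and `filter_mutations`, the 512-byte sector size, and the non-overlapping extent layout are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumed sector size in bytes (hypothetical)


@dataclass(frozen=True)
class Extent:
    start: int   # first sector of the extent
    length: int  # number of sectors in the extent
    path: str    # file that owns these sectors


class DiskIndex:
    """Hypothetical pre-built index mapping sector ranges to file paths.

    Assumes extents do not overlap one another.
    """

    def __init__(self, extents):
        self._extents = sorted(extents, key=lambda e: e.start)
        self._starts = [e.start for e in self._extents]

    def files_for_write(self, sector, count):
        """Return paths whose extents overlap [sector, sector + count)."""
        end = sector + count
        # Begin at the last extent that starts at or before `sector`.
        i = max(bisect_right(self._starts, sector) - 1, 0)
        hits = []
        while i < len(self._extents) and self._extents[i].start < end:
            e = self._extents[i]
            if e.start + e.length > sector and e.path not in hits:
                hits.append(e.path)
            i += 1
        return hits


def filter_mutations(paths, watched_prefixes):
    """Keep only mutated paths under a watched prefix (one filtering layer)."""
    prefixes = tuple(watched_prefixes)
    return [p for p in paths if p.startswith(prefixes)]


index = DiskIndex([
    Extent(0, 8, "/etc/passwd"),
    Extent(8, 16, "/var/log/syslog"),
    Extent(100, 4, "/home/a.txt"),
])
mutated = index.files_for_write(sector=6, count=4)
forwarded = filter_mutations(mutated, ["/var/log/"])
```

In this sketch, a write that spans sectors 6 through 9 touches two files, and only the log file survives the filter, so only that notification would travel further up the hierarchy. A real index would also need to track file system metadata updates, which this example ignores.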