We identify the root causes of this inability to support high-density VM scenarios as (i) high runtime overheads and (ii) unpredictable scheduling heuristics. To better support high VM densities, we propose Tableau, a VM scheduler that guarantees a minimum processor share and a maximum bound on scheduling delay for every VM in the system. Tableau achieves this by combining a low-overhead, core-local, table-driven dispatcher within the hypervisor with a fast on-demand table-generation procedure (triggered asynchronously upon VM creation and teardown) that employs scheduling techniques typically used in hard real-time systems.
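To make the idea of a core-local, table-driven dispatcher concrete, the following is a minimal sketch and not Tableau's actual implementation. It assumes a cyclic per-core table of fixed time slots that a separate planner has already generated (the planner is not modeled here). The names `Slot`, `CoreTable`, `lookup`, `share`, and `max_gap_us`, as well as the slot values, are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    start_us: int       # inclusive offset within the table
    end_us: int         # exclusive offset within the table
    vm: Optional[str]   # None marks an idle slot


class CoreTable:
    """Hypothetical per-core cyclic scheduling table (illustration only)."""

    def __init__(self, slots: List[Slot], length_us: int) -> None:
        self.slots = sorted(slots, key=lambda s: s.start_us)
        self.length_us = length_us
        prev_end = 0
        for s in self.slots:
            # Slots must be non-empty, non-overlapping, and fit in the table.
            if not (prev_end <= s.start_us < s.end_us <= length_us):
                raise ValueError(f"invalid or overlapping slot: {s}")
            prev_end = s.end_us
        self._starts = [s.start_us for s in self.slots]

    def lookup(self, now_us: int) -> Optional[str]:
        """Return the VM that owns the core at time now_us, or None if idle."""
        t = now_us % self.length_us          # the table repeats cyclically
        i = bisect_right(self._starts, t) - 1
        if i >= 0 and t < self.slots[i].end_us:
            return self.slots[i].vm
        return None

    def share(self, vm: str) -> float:
        """Fraction of the table length reserved for vm."""
        busy = sum(s.end_us - s.start_us for s in self.slots if s.vm == vm)
        return busy / self.length_us

    def max_gap_us(self, vm: str) -> int:
        """Longest wait between the end of one slot of vm and its next slot."""
        mine = [s for s in self.slots if s.vm == vm]
        if not mine:
            raise KeyError(vm)
        n = len(mine)
        return max(
            (mine[(i + 1) % n].start_us - mine[i].end_us) % self.length_us
            for i in range(n)
        )


# Example table: 10 ms long, two VMs, one idle slot (values are made up).
table = CoreTable(
    [
        Slot(0, 2_000, "vm-a"),
        Slot(2_000, 5_000, "vm-b"),
        Slot(5_000, 7_000, "vm-a"),
        Slot(7_000, 10_000, None),
    ],
    length_us=10_000,
)
```

In this sketch, the dispatch decision is a lookup into a precomputed table, so each VM's reserved share and its worst-case gap between slots can be read directly off the table rather than emerging from runtime heuristics.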
In an evaluation comparing Tableau against three current Xen schedulers on a 16-core Intel Xeon machine, Tableau is shown to improve both tail latency (e.g., a 17x reduction in maximum ping latency compared to Credit, Xen's default scheduler) and throughput (e.g., 1.6x peak web server throughput compared to Xen's real-time RTDS scheduler when serving 1 KiB files with a 100 ms SLA).
While Tableau solves one piece of the unpredictability puzzle, namely the VM scheduler, there are other sources of unpredictability that arise in a shared, high-density setting. We therefore propose extensions of Tableau to deal with two other major sources of unpredictability: last-level cache (LLC) interference caused by other VMs co-located on the same CPU socket, and delays that arise due to I/O scheduling.