Although virtualization has eased the day-to-day lives of IT administrators, analyzing the performance of virtualized infrastructure remains very difficult. Admittedly, collecting statistics for a large number of servers is not a new problem. Neither is analyzing the virtualization performance of an individual virtual machine or the performance of a multi-tier virtualized application. The challenge lies in building tools that do all three at the same time and present meaningful information to end users, many of whom are not virtualization experts.
This paper describes a scalable performance monitoring framework for virtualized environments. It first considers the pain points customers experience when troubleshooting performance in virtualized environments. Common themes across most customer environments include the need to gather granular data in a scalable way and to correlate data across layers, from the application to the guest operating system to the physical host. Using these requirements to guide the design, the paper describes a prototype for scalable statistics collection that leverages existing statistics-gathering APIs and attempts to address these requirements. It then describes how the authors used this design to address several real-world troubleshooting use cases. It concludes with results demonstrating the scalability of the approach for typical virtualized datacenter sizes.
Vijayaraghavan Soundararajan, Balaji Parimi, Jon Cook