VMware Academic Program
Committed to strengthening VMware’s relationship with the academic and research communities.

RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vSphere 5

While high-bandwidth, low-latency communication has long been important for High Performance Computing, it is becoming increasingly important in modern, scale-out Enterprise and Cloud environments. This research report from the VMware Office of the CTO presents performance results using Mellanox QDR InfiniBand in Red Hat guests with VM DirectPath I/O (passthrough mode). We demonstrate that bandwidths delivered over a wide range of message sizes closely match those achievable in non-virtualized environments, and we show that latencies under two microseconds can be achieved.

Authors

Josh Simons, Adit Ranadive, Bhavesh Davda

3 thoughts on “RDMA Performance in Virtual Machines using QDR InfiniBand on VMware vSphere 5”

  1. Giovanni Lanaro

    Hello,
    We tried to reproduce your results with a QDR fabric based on the
    MT26428 HCA and MT3600 switch, with ESXi 5 at the latest patch (build 702118), on a Supermicro SYS6026T-NTR+ system with the latest BIOS from the web site.

    We have Intel VT enabled, VT-d enabled, PCI passthrough enabled for the
    MT26428, the MT26428 assigned to a CentOS 6.2 virtual machine, and OFED installed.

    After compiling OFED 1.5.4.1, we get the following warning about the
    MT26428 PCI Express link:

    Device (15b3:673c):0b:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
    Link Width is not 8x
    PCI Link Speed: 5Gb/s

    Of course, this is not the case with a native CentOS install without PCI
    passthrough.

    This has an impact on performance. Is there any specific configuration
    needed to expose an x8 link width?

    Thank you very much for your articles.

    1. Bhavesh Davda

      Hello Giovanni,

      Sorry about the delay: didn’t notice your post until right now.

      We emulate much of the PCI configuration space of passthrough devices, including the PCI express link width. We advertise a PCIe link width of 32x for all passthrough devices.

      That said, I don’t expect the link width advertised in the PCIe configuration space of the device to actually impact performance in terms of either throughput/bandwidth or latency, *unless* the Mellanox driver that’s part of the OFED 1.5.4.1 stack makes certain decisions to limit throughput based on the PCIe link width, which would be quite unusual.
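For readers who want to see what link width and speed a guest actually observes, here is a minimal sketch, assuming a Linux guest with `lspci` (pciutils) installed. The function names `parse_pcie_link` and `read_pcie_link` are hypothetical helpers written for this illustration and are not part of OFED or vSphere. Because the PCI configuration space of a passthrough device is emulated, the values read inside the VM reflect what the hypervisor advertises, not necessarily the physical link.

```python
import re
import subprocess

# Matches the "LnkCap:" and "LnkSta:" lines of `lspci -vv` output,
# e.g. "LnkSta: Speed 5GT/s, Width x8, ...".
_LINK_RE = re.compile(
    r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)"
)


def parse_pcie_link(lspci_text):
    """Extract (speed in GT/s, lane count) for LnkCap and LnkSta.

    Returns a dict such as {"LnkCap": (5.0, 8), "LnkSta": (5.0, 8)}.
    Keys are absent when the corresponding line is not in the text.
    """
    links = {}
    for line in lspci_text.splitlines():
        match = _LINK_RE.search(line)
        if match and match.group(1) not in links:
            links[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return links


def read_pcie_link(address):
    """Run `lspci -vv -s <address>` and parse its link fields.

    Root privileges are usually needed for lspci to show LnkCap/LnkSta.
    """
    result = subprocess.run(
        ["lspci", "-vv", "-s", address],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_pcie_link(result.stdout)
```

Calling `read_pcie_link("0b:00.0")` as root inside the guest, using the device address from the warning quoted earlier in this thread, returns the advertised capability (LnkCap) and negotiated status (LnkSta). Comparing that against the same call on a native install shows whether the two environments report different values.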

      1. Giovanni Lanaro

        Hello,
        After lengthy debugging with Mellanox, it was discovered that this is a misleading message from the MLX OFED driver: it says “Link width is not 8x”, but in reality the link width is 8x (per Mellanox support).
