VMware Inc., and Georgia
Institute of Technology
In a software-defined data center, network services are defined and developed as virtual and physical appliances that are integrated into the data center to customize its functionality. Network services preprocess network packet traffic before it arrives at the destination virtual machine (VM) or post-process the traffic after it leaves the source VM. Such pre- and post-processing enables specified network policies (such as intrusion detection, data compression, and load balancing) to be executed within the data center and maintains a consolidated and secure network environment for resource sharing. Because network services must coexist with the data center's internal infrastructure for data processing, debugging network services and verifying service rules are difficult problems when the network packets are invisible in the virtualization infrastructure. Network Service Extensibility (NetX) is a network extensibility solution that enables VMware partners to integrate their service VMs into the VMware ecosystem. NetX requires the presence of partner-inserted traffic-filter rules that redirect network IP packets to and from the partner's service VMs. We have developed an IP-packet tracing tool (PktTrace) to track IP packets and to verify the behavior of these rules in the data center's internal infrastructure. PktTrace tracks IP packets at various points in the data path. It also detects possible packet loss and measures latency. Such tracking exposes the details of network traffic for service debugging and for verification of these packet-redirect rules. PktTrace helps us better understand networking behavior in the software-defined data center. It also enables a quick measurement of the latency added by a network service. A measurement of an example security service is provided in the paper to demonstrate the capabilities of PktTrace.
Our work with PktTrace verifies the rules for packet redirect to a partner service in the virtualization infrastructure and provides additional information about performance.
Keywords: software-defined data center, software-defined networking, virtualization, debugging, packet tracking
Cloud computing has emerged as a new paradigm for consolidating applications and sharing resources in a common infrastructure. The data center is the physical place that hosts cloud services and embodies such consolidation. Traditional data centers, however, are poorly suited to hosting cloud services because of their high configuration and maintenance costs and their limited consolidation and scalability. To change this, virtualization has grown into an important building block for transforming the traditional data center into a software-defined data center (SDDC). The SDDC fully virtualizes all of its resource sharing to meet the performance and reliability demands of cloud computing. This change has also been extended to network infrastructure and service sharing.
Network virtualization is widely used in the SDDC. Such virtualization aims to provide separate logical networks on a shared physical infrastructure. In conventional networking, the data, control, and management planes are implemented in firmware, which limits network resource sharing in the data center. Software-defined networking (SDN) is an emerging network architecture in which the data plane, control plane, and management plane are decoupled and implemented in software instead. Such software-defined planes enable the underlying network infrastructure to be abstracted and virtualized as independent entities. L4–L7 networking services are also moving away from special-purpose hardware to VMs and are being treated as virtual entities. This migration gives the data center much more flexibility in its network control. In the SDDC, network services are hosted in the service VM (SVM).
With the deployment of an SVM, network services enable the network data to be preprocessed before it arrives at the destination VM and postprocessed after it leaves the source VM. Using L4–L7 network services, the data center can easily configure and enforce any network policy, such as intrusion detection, data compression, and load balancing. Such network policies maintain a consolidated and secure network environment for resource sharing in the cloud. Because network services are mainly developed and provided by VMware partners rather than by the virtualization infrastructure provider, VMware has built an integrated infrastructure that enables partners to integrate their services into the SDDC. Figure 1 shows an overview of the SDDC architecture. In Figure 1, the SDDC hosts two kinds of VMs: protected VMs (PVMs) and SVMs. PVMs host the user applications, and SVMs provide network isolation and protection services for the PVMs that request them. The data plane is hosted in the virtualization infrastructure to deliver and forward network IP packets based on a set of rules. A data plane agent is correspondingly needed in the SVM to handle the IP packets coming from the data plane in the virtualization infrastructure (kernel). After the services are deployed in the SDDC, packet routing rules with actions, such as forwarding, dropping, and punting to the SVM, need to be inserted into the data plane. Any incoming or outgoing IP packet is checked by the data plane in the virtualization infrastructure. If a rule is hit for an IP packet, the defined actions take effect on that packet, and the packet is forwarded to the corresponding network service for processing.
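As a rough illustration only (this is not VMware's data-plane code, and all names are hypothetical), the rule-checking step just described can be sketched in Python: each rule carries a match description and an action, and the first rule a packet hits determines what happens to it.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Action(Enum):
    FORWARD = auto()  # continue along the normal data path
    DROP = auto()     # discard the packet
    PUNT = auto()     # redirect to the partner's service VM

@dataclass
class Rule:
    proto: int        # 1 = ICMP, 6 = TCP, 17 = UDP; None matches any
    dport: int        # destination port; None matches any
    action: Action

@dataclass
class Packet:
    proto: int
    dport: int

def apply_filter(pkt, rules):
    """Return the action of the first rule the packet hits, else FORWARD."""
    for rule in rules:
        if rule.proto in (None, pkt.proto) and rule.dport in (None, pkt.dport):
            return rule.action
    return Action.FORWARD

# A rule like the one used in Section 4: punt all TCP packets to the SVM.
rules = [Rule(proto=6, dport=None, action=Action.PUNT)]
print(apply_filter(Packet(proto=6, dport=80), rules))   # TCP hits the rule
print(apply_filter(Packet(proto=17, dport=53), rules))  # UDP falls through
```

The sketch deliberately reduces the match description to a protocol and port; a real filter matches on the full connection tuple and more.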
After the network service finishes processing the IP packet, the packet is sent back onto the data path to continue its journey. During this process, network services work with the data plane to divide packet data-path processing into two phases: one in the virtualization infrastructure and one in the SVM. Both phases should be visible when network services are being debugged and unexpected network behavior occurs. Therefore, network extensibility should cover not only service deployment and management but also IP packet tracking in the data plane. This helps partners debug their services, especially when there is packet loss.
In this paper, we explore VMware NetX and develop an IP packet tracing tool (PktTrace) to complement its functionality. PktTrace tracks IP packets at various points in the data path to detect possible packet loss and processing latency. With PktTrace, we can better understand network behavior and help partners debug their services in the SDDC.
PktTrace makes the following contributions:
- It tracks IP packets at various points in the packet flow based on a given set of rules in the kernel of the virtualization infrastructure. Such tracking can detect packet loss and the exact location of the loss.
- It records and analyzes the IP packet processing latency attributable to each network component. Such recording and analysis quantifies the performance of every network component involved in packet delivery in the SDDC.
- PktTrace is evaluated and verified with the debugging of a real network security service provided by a VMware development partner company specializing in security products. Experimental results show that PktTrace can significantly help partners detect packet loss during debugging and provide the performance results to help remove performance bottlenecks and improve the quality of their services.
The remainder of the paper is organized as follows. Section 2 presents the motivation behind our work and the technical background. Section 3 presents the design and implementation of the PktTrace tool. Section 4 evaluates PktTrace with an actual commercially available network security service provided by a development partner-vendor. Section 5 concludes the paper and discusses future work.
In this section, we present the basic architecture of NetX, the framework that motivated us to develop an IP packet tracking tool, PktTrace, which complements the functionality of network service management. We also describe VProbes and pktCapture, the two underlying tools used in PktTrace development.
NetX is a network service extensibility framework developed by VMware for its SDDC development and management. NetX enables customers and partners to develop their own device, service, and operational models to match their specific network service delivery needs. As part of the NetX framework, customers and partners receive development tools for integrating their own network service implementations into the VMware SDDC ecosystem.
The NetX infrastructure can be used by the partner service in multiple ways: the service can be a VM on the same host as the VMs that it protects, or the service can be on an L2 or L3 network such that it protects multiple VMs on multiple hosts over a network connection. The service can also be hosted on separate hardware and not be a VM at all. In all cases, as long as the service involves the insertion of rules into the kernel of the hosts that run the protected VMs, our PktTrace tool can be used to track rule behavior. This paper focuses on the case of the service VM being hosted on the same host as the protected VMs.
By leveraging NetX, customers and partners can develop and integrate network services into the SDDC service model. Figure 2 shows the workflow of using NetX to integrate and manage SVMs in the SDDC; the figure is partially adapted from . In Figure 2, two PVMs (PVM1 and PVM2) and an SVM are deployed to support traffic filtering. PVM1 does not have a NetX data plane agent, as illustrated in Figure 2; in this case, no PVM1 IP packet needs to be forwarded to or processed by a network service. As illustrated by PVM2, when a NetX data plane agent is present, every PVM2 IP packet is checked and, when a rule is hit, forwarded to the specified network service residing in the SVM. Customers and partners deploy and configure the data plane and data plane agents with a management controller through the VMware vCloud® infrastructure.
2.2 VProbes and pktCapture
VProbes is a tool developed by VMware for probing the performance of its virtualization infrastructure and applications. A set of probes has been previously embedded within both the virtualization infrastructure and its supporting libraries. Through selective activation of these probes, various data points within the software hierarchy at the application, library, or kernel level can be recorded and correlated with one another to form a highly detailed profile of any targeted application. VProbes enables system-level users to understand what is occurring in a VM and in the virtual infrastructure on which the VM runs.
To dynamically enable probes and define instrumentation routines via VProbes, a C-like scripting language is utilized that provides support for features such as variables and control structures. VProbes provides static and dynamic probe modes. By combining these mechanisms with other facilities provided by the scripting language, developers can easily form complex instrumentation routines that collect data from probes in a logical and organized manner. For tracing and detecting IP packets, a set of scripts has been developed for this paper to capture and analyze IP packets. Users can turn on the instrumentation with the scripts and execute them on demand. The IP packets are intercepted at the specified probe points and analyzed, and the analysis results are saved into an output file for further processing.
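The VProbes scripting syntax itself is not reproduced here; as a loose analogy only (all names hypothetical), the core idea of selectively activating named probe points and collecting timestamped, correlated records can be sketched in Python:

```python
import time

class ProbeRegistry:
    """Toy analogue of selectively enabled probe points (not VProbes syntax)."""
    def __init__(self):
        self.enabled = set()   # names of activated probe points
        self.records = []      # (timestamp, probe name, payload) tuples

    def enable(self, name):
        self.enabled.add(name)

    def fire(self, name, payload):
        # A disabled probe costs almost nothing; an enabled one records data.
        if name in self.enabled:
            self.records.append((time.monotonic(), name, payload))

probes = ProbeRegistry()
probes.enable("filter:ingress")          # activate only the points we need
probes.fire("filter:ingress", {"pkt": 1})
probes.fire("vnic:deliver", {"pkt": 1})  # not enabled, so nothing is recorded
print(len(probes.records))               # 1
```

The design point this mirrors is that instrumentation is pay-as-you-go: probes that are not activated impose essentially no overhead, which is what makes always-present probe points practical in a hypervisor.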
pktCapture is a packet-capture tool developed by VMware that can capture IP packets in the networking appliances of a VMware SDDC. The current pktCapture provides the ability to capture packets as seen by any packet processing function associated with the networking appliances, such as the uplink module, the virtual network interface card module, and the virtual filters. In our project, pktCapture is used only for packet tracing in the virtual networking appliances; in the virtualization kernel, packet tracing must still be implemented with VProbes scripts.
Our implementation of PktTrace tailors the use of VProbes and pktCapture, builds a framework around them, and makes them suitable for the NetX use case.
3. Design and Implementation
PktTrace is an IP packet tracing tool for the VMware SDDC that helps users detect possible network service failures or performance degradation. PktTrace can be used to debug network services after they have been integrated into the data center. The core idea of our design is to place multiple detection points along the IP packet path and check the packets in the live network stream.
3.1 Detection Points of PktTrace
Figure 3 illustrates the packet detection points specified by PktTrace in single-host mode. Four core detection points are developed in PktTrace in this paper:
- The point where the packet arrives
- The point of departure of the packet from the kernel to the SVM (which is the point of rule insertion)
- The point where the packet returns from the SVM into the kernel (which is the post point of rule insertion)
- The point where the packet is ready to be sent to the PVM, at which we have tracked and verified the "hitting" of the rule meant for this tailored packet and can therefore conclude that the rule functions correctly
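Taken together, these four points localize packet loss: a lost packet is one that appears at some point but never at the next one downstream. A minimal sketch of that correlation step (point names and packet IDs are illustrative, not PktTrace's actual output format):

```python
# The four detection points, in data-path order.
POINTS = ["arrival", "punt_to_svm", "return_from_svm", "deliver_to_pvm"]

def localize_loss(seen):
    """seen maps point name -> set of packet IDs observed there.
    Returns {packet ID: last point at which the packet was observed}."""
    lost = {}
    for pkt in seen[POINTS[0]]:
        for prev, nxt in zip(POINTS, POINTS[1:]):
            if pkt in seen[prev] and pkt not in seen[nxt]:
                lost[pkt] = prev
                break
    return lost

seen = {
    "arrival":         {1, 2, 3},
    "punt_to_svm":     {1, 2, 3},
    "return_from_svm": {1, 3},      # packet 2 never came back from the SVM
    "deliver_to_pvm":  {1, 3},
}
print(localize_loss(seen))  # {2: 'punt_to_svm'}
```

In the example, packet 2 was punted to the SVM but never returned, so the loss is attributed to the SVM phase rather than to the kernel data path.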
In Figure 3, there are two paths working in the SDDC. The red arrows show the management path, which enables the configuration and integration of network services in the SDDC. The management path includes three parts:
- Virtual appliance <=> partner service manager
- Partner service manager <=> VMware® vCenter™/VMware® vShield™
- vCenter/vShield manager <=> packet filter
In this path, users configure, start, and stop service appliances in the VMware hypervisor (VMware® ESXi™). In addition, users need to register the rule in the packet filter. Incoming and outgoing packets are detoured to the service virtual appliance if the rule is hit in the filter. This filter is opaque to users, so one of the most important functions of PktTrace is verifying the "hitting" of rules in the filter. In Figure 3, the green arrows show the IP packet processing path in the SDDC. The path includes five parts:
- Physical NIC => virtual switch
- Virtual switch port => filter
- Filter => virtual appliance
- Virtual appliance => filter
- Filter => virtual NIC
Figure 3 does not show the egress data path, because egress packet processing follows a similar path to ingress processing, albeit in the opposite direction. The detection points check the IP packets on the data path.
3.2 Interface for Using PktTrace
This section introduces PktTrace in the ESXi hypervisor. A command-line interface (CLI) is the primary user interface for PktTrace. The user executes the pktTrace CLI command and specifies the protocol, the source and destination addresses and ports, and the switch ports, as in the following command synopsis:
    pktTrace [-h] [-c COUNT] [-m MODE] [--proto=PROTO] [-s SRCIP] [--sport=SPORT]
             [-d DSTIP] [--dport=DPORT] [--uplink=UPLINK]
             [--switchport=SWITCHPORT] [--vnicbackend=VNICBACKEND]

    -h, --help                     print this help message
    -c COUNT                       the number of packets to be checked; the maximum is 64
    -m MODE                        the mode, i.e., latency measurement or rule verification
    --proto=PROTO                  the protocol type: 1 for ICMP, 6 for TCP, 17 for UDP
    -s SRCIP, --source=SRCIP       specify the source IP address
    --sport=SPORT                  specify the port number on the client node
    -d DSTIP, --destination=DSTIP  specify the destination IP address
    --dport=DPORT                  specify the destination port number on the server node
    --uplink=UPLINK                specify the name of the physical NIC, e.g., vmnic1
    --switchport=SWITCHPORT        specify the switch port to be traced
    --vnicbackend=VNICBACKEND      specify the vNIC backend port to be traced
The CLI offers two modes for network service verification, and the -m option lets users switch between them. Mode 0 measures the packet processing latency added by the SVM. When a PVM employs the SVM to preprocess its packets, the packets are detoured to the SVM, which adds extra latency to packet delivery. Such latency is an issue for some time-critical applications, such as real-time data processing. If the deadline cannot be met with the extra service enabled, the corresponding SVM should be removed or replaced in the SDDC. Mode 0 detects the latency of SVMs. Mode 1 verifies the functionality of the rules in the filter. The SVMs need to work together with the rules for packet processing. When a packet arrives at the filter, the filter checks its rules; if the packet matches a rule's description, the filter forwards the packet to the corresponding SVM. For these actions, users need PktTrace to help them trace the packet forwarding when packet loss happens.
4. Performance Evaluation
4.1 Experimental Setup
Experiments were run on two machines: one ESXi server and one client system. Figure 4 shows the architecture of the experimental platform. Machine A is an x86 server equipped with 24GB of memory and two 1Gbps NIC ports. Machine B is a common desktop with a 100Mbps network connection. ESXi (Release 5.5) is configured and installed on Machine A. Three protected VMs and one service VM run on top of ESXi. The service VM is a simple security service provided by a development partner; the service does not perform any concrete task and simply forwards the packets. The three protected VMs are configured as general network servers to provide TCP/UDP access. Network access to Machine A is created on Machine B.
To enable the SVM for IP packet processing, one firewall rule is added into the filter, as in Figure 5. This rule includes the specification of the action and protocol type. In this rule, all TCP packets need to be forwarded to the service for further processing.
4.2 Latency Measurement
Mode 0 is selected for packet processing latency measurement. In this mode, timestamps are collected and analyzed at all the packet processing points along the data path. The rule diverts the packets to the SVM if they meet the criteria specified in the rule. Detection points are inserted at the point of departure of the packet into the SVM after the rule hit and at the point of arrival back into the kernel after the packet is processed by the SVM.
The tailored packets are inserted, and it is ensured that they are sent to the SVM and return from the SVM. The difference between the timestamps at the two points determines the latency, which is the time taken up by the service. This is not an accurate measurement of latency, but it does provide a very quick measure. PktTrace itself collects the timestamps with VProbes, which is very lightweight. Because PktTrace measures relative timing, the impact of PktTrace itself adding to the latency (the so-called Heisenberg effect) is minimal.
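The aggregation step is simple arithmetic: per packet, the service latency is the return-point timestamp minus the punt-point timestamp, averaged over the packets observed at both points. A sketch (timestamp values are hypothetical, in microseconds):

```python
def service_latency_us(punt_ts, return_ts):
    """punt_ts/return_ts map packet ID -> timestamp (us) at the two points.
    Returns per-packet latencies and their average, skipping packets
    that never returned from the SVM."""
    lat = {pkt: return_ts[pkt] - punt_ts[pkt]
           for pkt in punt_ts if pkt in return_ts}
    avg = sum(lat.values()) / len(lat) if lat else 0.0
    return lat, avg

punt_ts   = {1: 100.0, 2: 250.0, 3: 400.0}
return_ts = {1: 295.0, 2: 445.0}            # packet 3 has not yet returned
lat, avg = service_latency_us(punt_ts, return_ts)
print(lat, avg)  # {1: 195.0, 2: 195.0} 195.0
```

Because both timestamps are taken by the same clock on the same host, clock skew is not an issue, which is what makes this quick relative measurement reasonable.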
Figure 6 shows a snapshot of the experimental results. Two-way timestamps (ingress and egress) were collected and analyzed. The results show that the latency cost is concentrated in stage 2 for the ingress stream and in stage 0 for the egress stream. These stages record the latency of the service in the two directions. The average latency of stage 2 for ingress packets is ~195.02 µs, which accounts for 84.7% of the total latency. The average latency of stage 0 for egress packets is ~31.91 µs, which accounts for about 81.5% of the total latency. These measured results show the performance impact of a service.
4.3. Rule Verification
Mode 1 is selected for verification of the rules in the filter of the ESXi kernel. The packet filter is the only module that coexists with the services in ESXi, so verifying the functionality of the rules in this filter is a very important step in service debugging. In this mode, rules are inserted by the partner as required by the design and are applied to incoming and outgoing packets. In this paper, we limit the discussion to incoming packets, because the outgoing packet behavior is a mirror image. When an arriving packet matches a rule, the packet is handled by the action associated with that rule. In our experiments, the associated action is ACTION_PUNT, which forwards the matched packets to the service. Tailored packets are inserted to help PktTrace verify the functionality of the rules. In these experiments, the protocol type is changed to 1, so that ICMP packets are forwarded.
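Conceptually (this is an illustrative sketch, not PktTrace internals; names are hypothetical), Mode 1 amounts to injecting tailored packets that should match the rule and confirming that each one is observed at the punt point:

```python
def verify_rule(tailored_ids, seen_at_punt):
    """Each tailored packet should hit the rule and be punted to the SVM.
    Returns (rule_ok, IDs of tailored packets never seen at the punt point)."""
    missing = sorted(set(tailored_ids) - set(seen_at_punt))
    return (not missing), missing

# Three tailored ICMP packets injected; packet 7 was never punted,
# so the rule (or the punt action) is not behaving as expected.
ok, missing = verify_rule([5, 6, 7], {5, 6})
print(ok, missing)  # False [7]
```

A failed check narrows the bug to the filter stage: either the rule's match description does not cover the tailored packet, or the punt action is not firing.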
Figure 7 shows a snapshot of the experimental results. Two-way timestamps (ingress and egress) were collected and analyzed. The results show that the tailored packets are detected on the specified points, that the rule was hit in the filter, and that the packets were forwarded to the SVM.
The experimental results show that PktTrace is capable of verifying the functionality of rules and detecting the performance cost of services.
5. Conclusions and Future Work
PktTrace is available as an ESXi command-line option. It has options for rule verification and latency measurement. Rule verification has a “detailed” option whereby the packet dumps are observed, and a default option for nondetailed behaviors. Use of PktTrace will significantly accelerate service development by VMware partners.
In the future, the tool will be integrated with the virtualized network-based releases. It will be available for use by other technologies such as EPSec, an endpoint security service integration program. The NetX and EPSec certification suites will be enhanced to include this tool for data path–based certification.
- Delivering on the Promise of the Software-Defined Data Center. VMware Accelerate Advisory Services, VMware Inc., 2013.
- VMware Network Extensibility Developer’s Guide, Technical Document, VMware Inc., 2013.
- Martim Carbone, Alok Kataria, Radu Rugina, and Vivek Thampi, “VProbes: Deep Observability into the ESXi Hypervisor,” VMware Technical Journal, Summer 2014.
- C. Du and H. Zou. “V-MCS: A Configuration System for Virtual Machine,” Proc. of Workshop in conjunction with Cluster’09.
- K. Hutton, M. Schofield, and D. Teare. Designing Cisco Network Service Architectures, 2d ed. Cisco Press, 2009.
- “SDN: The New Norm for Networks.” Open Networking Foundation, 2012.
- M. Raymer, “Uncertainty Principle for Joint Measurement of Noncommuting Variables,” American Journal of Physics, 1994.
- H. Zou, W. Wu, et al., “An evaluation of Parallel Optimization for OpenSolaris Network Stack,” Proc. of the 37th Annual IEEE Conference on Local Computer Networks (LCN), 2010.