vSphere Big Data Extensions is available as a fully supported product for use with vSphere Enterprise and Enterprise+ Editions. Official support for Big Data Extensions on vSphere Standard Edition is not currently available. This Fling lets you deploy Big Data Extensions on vSphere Standard Edition for the purpose of experimentation and proof-of-concept projects to explore the potential of running Hadoop in a virtualized environment.
Big Data Extensions runs on top of Project Serengeti, the open source project initiated by VMware to automate the deployment and management of Apache Hadoop and HBase on virtual environments such as vSphere. Big Data Extensions simplifies the Hadoop and HBase deployment and provisioning process, gives you a real-time view of the running services and the status of their virtual hosts, provides a central place from which to manage and monitor your Hadoop and HBase cluster, and incorporates a full range of tools to help you optimize cluster performance and utilization.
Big Data Extensions provides the following features:
- Support for Major Hadoop Distributions. Big Data Extensions includes support for Apache Hadoop, Cloudera, Greenplum, Hortonworks, MapR, and Pivotal. HBase, Pig, and Hive are also supported. The Big Data Extensions virtual appliance includes Apache Hadoop 1.2.
- Quickly Deploy, Manage, and Scale Hadoop Clusters. Big Data Extensions enables the rapid deployment of Hadoop clusters on VMware vSphere. You can quickly deploy, manage, and scale Hadoop nodes using the virtual machine as a simple and elegant container. Big Data Extensions provides a simple deployment toolkit that can be accessed through VMware vCenter Server to deploy a highly available Hadoop cluster in minutes using the Big Data Extensions user interface.
- Graphical User Interface Simplifies Management Tasks. The Big Data Extensions plug-in, a graphical user interface integrated with vSphere Web Client, lets you easily perform common Hadoop infrastructure and cluster management administrative tasks.
- Elastic Scaling Lets You Optimize Cluster Performance and Resource Utilization. Elasticity-enabled clusters start and stop virtual machines automatically and dynamically to optimize resource consumption. Elasticity is ideal in a mixed workload environment to ensure that high priority jobs are assigned sufficient resources. Elasticity adjusts the number of active compute virtual machines based on configuration settings you specify.
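The elasticity behavior described in the last bullet can be pictured with a toy sketch. It is an illustration only and is not taken from Big Data Extensions: the function names, the `demand` input, and the `min_vms`/`max_vms` settings are hypothetical stand-ins for the configuration settings you specify.

```python
# Toy illustration of elastic scaling. NOT the Big Data Extensions algorithm.
# All names and parameters here are hypothetical.

def target_active_compute_vms(demand: int, min_vms: int, max_vms: int) -> int:
    """Clamp the number of compute VMs wanted by the workload to the configured range."""
    if min_vms < 0 or max_vms < min_vms:
        raise ValueError("expected 0 <= min_vms <= max_vms")
    return max(min_vms, min(demand, max_vms))


def plan_scaling(active: int, demand: int, min_vms: int, max_vms: int) -> tuple:
    """Return ("start" | "stop" | "none", count) to move from `active` to the target."""
    target = target_active_compute_vms(demand, min_vms, max_vms)
    delta = target - active
    if delta > 0:
        return ("start", delta)
    if delta < 0:
        return ("stop", -delta)
    return ("none", 0)
```

For example, with two compute virtual machines running, a demand for five, and a configured range of one to four, `plan_scaling(2, 5, 1, 4)` asks to start two more rather than three, because the upper bound caps the target at four.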
vSphere 5.0 (or later) Standard Edition.
NOTE: The Big Data Extensions graphical user interface is only supported when using vSphere Web Client 5.1 and later. If you install Big Data Extensions on vSphere 5.0, you must perform all administrative tasks using the command-line interface.
When installing Big Data Extensions on vSphere 5.1 or later, you must use vCenter Single Sign-On (SSO) to provide user authentication. When logging into vSphere 5.1 or later, you pass authentication to the vCenter Single Sign-On server, which you can configure with multiple identity sources, such as Active Directory and OpenLDAP. On successful authentication, your username and password are exchanged for a security token, which is used to access vSphere components such as Big Data Extensions.
Enable the vSphere Network Time Protocol on the ESXi hosts. The Network Time Protocol (NTP) daemon ensures that time-dependent processes occur in sync across hosts.
A port group (or dvportgroup) with at least 6 uplink ports and connectivity to the dvportgroups used to deploy your Hadoop clusters.
40 GB or more (recommended) of disk space for the management server and Hadoop template virtual disks.
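The requirements above can be collected into a short pre-flight checklist. The sketch below is only an illustration: the `Environment` record and both function names are hypothetical, the values are assumed to be entered by hand rather than queried from vCenter, and the vSphere version is used as a stand-in for the vSphere Web Client version when deciding whether the graphical interface is available.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hand-entered facts about the target environment (hypothetical record)."""
    vsphere_version: tuple   # e.g. (5, 1)
    uplink_ports: int        # uplink ports on the port group or dvportgroup
    free_disk_gb: float      # space for the management server and Hadoop template disks
    ntp_enabled: bool        # NTP enabled on the ESXi hosts
    sso_configured: bool     # vCenter Single Sign-On available


def check_requirements(env: Environment) -> list:
    """Return a list of unmet requirements; an empty list means none were found."""
    problems = []
    if env.vsphere_version < (5, 0):
        problems.append("vSphere 5.0 or later is required")
    if env.vsphere_version >= (5, 1) and not env.sso_configured:
        problems.append("vCenter Single Sign-On is required on vSphere 5.1 or later")
    if not env.ntp_enabled:
        problems.append("NTP should be enabled on the ESXi hosts")
    if env.uplink_ports < 6:
        problems.append("the port group needs at least 6 uplink ports")
    if env.free_disk_gb < 40:
        problems.append("less than the recommended 40 GB of disk space")
    return problems


def gui_available(env: Environment) -> bool:
    """The graphical interface needs 5.1 or later; on 5.0 only the CLI can be used."""
    return env.vsphere_version >= (5, 1)
```

An environment on vSphere 5.0 with six uplink ports, 40 GB of free disk space, and NTP enabled passes every check, but `gui_available` returns `False` for it, so all administrative tasks would go through the command-line interface.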
Please refer to 'How to Deploy Big Data Extensions on vSphere Standard Edition' for the deployment instructions.
After the BDE management server VM and Hadoop template VM are deployed, refer to the vSphere Big Data Extensions Administrator's and User's Guide and the CLI Guide.
Hui Hu, Big Data team
Jun Wang, Big Data team
Emma Lin, Big Data team
Xinhui Li, Big Data team
Xiaoding Bian, Big Data team
Lizhao Du, Big Data team
Binbin Zhao, Big Data team
Yifeng Xiao, Big Data team