VMware Updates Big Data Extensions with Hadoop 2 Support

Built-in vSphere tools help run Hadoop in virtualized environments.

VMware Inc. on Friday updated its Big Data Extensions (BDE) for its vSphere virtualization platform, adding support for Hadoop 2.

The BDE set of integrated management tools -- built into vSphere -- helps organizations deploy, run and manage Hadoop. With BDE, vSphere users can install and configure Hadoop clusters via the vCenter management tool UI.

Friday's announcement came exactly two years after VMware announced its related open source project Serengeti to facilitate the running of Apache Hadoop in virtualized environments. BDE is the enterprise version of project Serengeti, bundled with commercial support. The software deploys Hadoop components such as Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive and HBase on the vSphere platform.

With the release of BDE 2.0, the software now supports the latest distributions of Apache Hadoop 2.0. Specifically, new distributions supported include Apache Bigtop 0.7.0, Cloudera CDH5, Hortonworks HDP 2.1, MapR 3.1 and Pivotal PHD 2.0.

Other enhancements include: the Hadoop Template Virtual Machine now uses CentOS 6.4 as its default OS; the Serengeti Management Server now supports IPv6 network addressing; new support for Internationalization Level 1; a central Web UI called the Serengeti Management Server Administration Portal for viewing, managing and troubleshooting Serengeti Services; and improved error handling.

VMware also combines BDE with its vCloud Automation Center to provide an on-premises Hadoop as a Service offering.

"BDE enables customers to run clustered, scale-out Hadoop applications on the vSphere platform, delivering all the benefits of virtualization to Hadoop users," VMware said. "BDE delivers operational simplicity with an easy-to-use interface, improved utilization through compute elasticity, and a scalable and flexible Big Data platform to satisfy changing business requirements."

For the under-the-hood details about how BDE works, VMware provides the following:

BDE is a downloadable virtual appliance integrated as a plug-in to vCenter Server. BDE requires vSphere 5.0 or later with an Enterprise or Enterprise Plus license. The Serengeti virtual appliance runs on top of vSphere and includes two virtual machines (VMs): the Serengeti Management Server and the Hadoop Template Server. The Serengeti Management Server handles creation of the cluster, including creation and configuration of the VMs and assignment of Master node and Slave node roles. To create and scale out the cluster, the Serengeti Management Server clones the Hadoop template. Once this is complete, the Serengeti Management Server starts the Hadoop service. BDE is controlled and monitored through vCenter Server.
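
For readers who think in code, the sequence VMware describes (clone the template for each node, assign roles, then start Hadoop) can be sketched as a small toy model in Python. This is not the Serengeti or BDE API: the class names, the `create_cluster` method, and the template name are all hypothetical, chosen only to illustrate the order of operations, and "worker" stands in for what VMware calls a Slave node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VM:
    """Hypothetical stand-in for a cloned Hadoop node VM."""
    name: str
    role: str                    # "master" or "worker"
    hadoop_running: bool = False


@dataclass
class ManagementServer:
    """Toy model of the management server's role; not a real API."""
    template: str = "hadoop-template"   # assumed template name
    log: List[str] = field(default_factory=list)

    def create_cluster(self, name: str, workers: int) -> List[VM]:
        # Step 1: clone the template once per node and assign a role.
        roles = ["master"] + ["worker"] * workers
        vms = []
        for i, role in enumerate(roles):
            vm = VM(name=f"{name}-{role}-{i}", role=role)
            self.log.append(f"cloned {self.template} -> {vm.name}")
            vms.append(vm)
        # Step 2: start Hadoop only after every VM has been created.
        for vm in vms:
            vm.hadoop_running = True
        self.log.append(f"started Hadoop on {len(vms)} nodes")
        return vms


server = ManagementServer()
cluster = server.create_cluster("demo", workers=3)
for line in server.log:
    print(line)
```

The point of the sketch is the ordering: every node is cloned from the same template and given a role before any Hadoop service starts, which matches the sequence in VMware's description.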

vSphere is available for download as a free trial and pricing is available on the company's Web site.

About the Author

David Ramel is an editor and writer for Converge360.

