First Major Release of Kubeflow Machine Learning Toolkit for Kubernetes Ships -- Virtualization Review

By John K. Waters
03/06/2020

Kubeflow, an open-source, cloud-native machine learning (ML) toolkit for the Kubernetes container-orchestration system, is out in version 1.0, its first major release.

"The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable," its site says. "Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow."

With the debut release, the maintainers of the project are graduating a core set of stable applications used to efficiently develop, build, train, and deploy ML models on Kubernetes.

The list of graduating apps in this release includes:

Kubeflow's UI, the central dashboard providing quick access to the components deployed in a Kubeflow cluster
The Jupyter notebook controller, which lets users create a Notebook custom resource (a Jupyter notebook is a shared document with live code, equations, visualizations, and narrative text)
TensorFlow Operator (TFJob), a Kubernetes custom resource for running TensorFlow training jobs on Kubernetes
PyTorch Operator, for distributed training
kfctl, the Kubeflow command-line interface (CLI) that's used to install and configure Kubeflow for deployment and upgrades
Profile controller and UI for multiuser management

"Kubeflow's goal is to make it easy for machine learning (ML) engineers and data scientists to leverage cloud assets (public or on-premise) for ML workloads," said Thea Lamkin, Google's open source strategist for AI/ML, in a blog post. "You can use Kubeflow on any Kubernetes-conformant cluster."

A Kubeflow Community User Survey published in December 2019 revealed that the ability to use Jupyter notebooks had emerged as a popular feature request among data scientists and ML engineers.

"With Kubeflow 1.0, users can use Jupyter to develop models," Lamkin said. "They can then use Kubeflow tools like fairing (Kubeflow's Python SDK) to build containers and create Kubernetes resources to train their models. Once they have a model, they can use KFServing to create and deploy a server for inference."

Kubeflow 1.0 also provides Kubernetes custom resources that make distributed training with TensorFlow and PyTorch simple. Distributed training was another popular feature request in that community user survey.
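To give a sense of what such a custom resource looks like, here is a minimal Python sketch that builds a TFJob manifest as a plain dictionary and prints it as JSON, which could then be submitted with a tool such as kubectl. The field layout (the kubeflow.org/v1 API version, the tfReplicaSpecs key, and the container name "tensorflow") reflects the commonly documented TFJob schema and should be checked against the installed operator version. The function name build_tfjob, the job name, the namespace, and the image are hypothetical placeholders, not values from the Kubeflow release.

```python
import json


def build_tfjob(name, image, workers=2, namespace="kubeflow"):
    """Return a TFJob manifest as a plain dict.

    Assumes the kubeflow.org/v1 TFJob schema; verify the fields
    against the operator version running in your cluster.
    """
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "tfReplicaSpecs": {
                # One replica group of identical workers; other groups
                # (for example parameter servers) would be added here.
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {
                        "spec": {
                            "containers": [
                                # Hypothetical training image.
                                {"name": "tensorflow", "image": image}
                            ]
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = build_tfjob(
        "mnist-train", "example.com/mnist-train:latest", workers=3
    )
    # JSON is accepted wherever a YAML manifest is, so this output
    # can be saved to a file and applied to a cluster.
    print(json.dumps(manifest, indent=2))
```

The point of the custom resource is that the user declares only the number of workers and the training image; scheduling the pods and wiring them together for distributed training is left to the operator.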

The Kubeflow Project was open sourced at the 2017 KubeCon USA event and has since grown "beyond our wildest expectations," Lamkin said, helped by the support of hundreds of contributors and 30 participating organizations, including Microsoft, Google, IBM, Cisco, Intel, and LinkedIn, among others.

The project evolved from an effort to open source the way Google ran its TensorFlow ML library internally, based on a pipeline called TensorFlow Extended. "It began as just a simpler way to run TensorFlow jobs on Kubernetes," the website explains, "but has since expanded to be a multi-architecture, multi-cloud framework for running entire machine learning pipelines."

"Ultimately, we want to have a set of simple manifests that give you an easy to use ML stack anywhere Kubernetes is already running, and that can self-configure based on the cluster it deploys into," the site states.

"The Kubeflow 1.0 release is a significant milestone, as it positions Kubeflow to be a viable ML Enterprise platform," said Jeff Fogarty, data science engineer at U.S. Bank. "Kubeflow 1.0 delivers material productivity enhancements for ML researchers."

The community has several more applications under development, which are planned for point updates of Kubeflow 1.0, including:

Pipelines (beta) for defining complex ML workflows
Metadata (beta) for tracking datasets, jobs, and models
Katib (beta) for hyper-parameter tuning
Distributed operators for other frameworks like XGBoost

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.