KubeCon 2025: Exploring the KubeCon Ecosystem, Part 1
KubeCon + CloudNativeCon North America 2025, which was held in Atlanta during the second week of November, was a showcase for innovative technologies.
The Kubernetes (K8s) and Cloud Native ecosystem showcased at this conference is extensive. See my previous coverage of Day 0, Day 1, and Day 2 for more details on the big show.
During the event, I had the opportunity to speak with representatives from many companies. As always, I was impressed not only by the scope but also by the depth of the technology on display. It's worth noting that this technology originates from both established companies and startups.
With so many companies at the event, choosing which ones to highlight for this article was a challenge. The breadth of projects associated with K8s is best captured by the Projects and Products Landscape chart maintained by the Cloud Native Computing Foundation (CNCF). The chart contains far too many entries to reproduce legibly here, but you can view it in its entirety online.
The CNCF members that develop and support this technology range from some of the world's largest IT companies to shops with just a few employees. Again, the membership chart has far too many entries to reproduce legibly here, but you can view the full chart online.
After careful consideration, I decided to cover the broadest possible range of companies and products, from mature, well-established companies like Nutanix, which delivers a complete, turnkey Kubernetes solution, to Tailscale, a startup laser-focused on providing network connectivity to containers running in Kubernetes.
Before discussing these companies, I would like to share highlights from a fascinating panel I had the opportunity to attend.
Cloud Native AI Production Roundtable
The key insights I took away from the Cloud Native AI Production Roundtable centered on the intersection of artificial intelligence and cloud-native technologies. The panelists agreed that Kubernetes has become the de facto platform for AI workloads. Rapid adoption within the open source community drove this trend, which eventually reached enterprise customers. That adoption was fueled by the ability to leverage existing investments and skills in cloud-native infrastructure, avoiding the need to build a separate, parallel stack for AI.
The panel, expertly moderated by Natasha Woods of the CNCF, consisted of the following industry experts:
| Panelist | Title & Company | Key Area of Expertise |
| --- | --- | --- |
| Lachlan Evenson | Principal Product Manager, Azure (Microsoft) | Kubernetes Steering Committee, cloud-native open source |
| Brandon Royal | Product Manager, Google Cloud | Agentic AI systems, AI infrastructure at scale |
| Keith Babo | Vice President of Product, solo.io | Agentic infrastructure, cloud-native application networking |
| Hong Wang | Co-founder and CEO, Akuity | GitOps, cloud-native control planes, and distributed systems |
The panelists unanimously confirmed that Kubernetes is a no-brainer choice for running AI. It provides a stable, extensible, and predictable distributed foundation for all three pillars of AI workloads: training, inference, and agentic systems.
Tom's Tip - Agentic systems are AI systems designed to take actions autonomously based on goals, context, and real-time information. They behave more like "agents" than tools.
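To make that distinction concrete, here is a minimal, illustrative sketch of an agentic loop in Python. Every function in it is a hypothetical stand-in (no real model or tool APIs are used); the point is the shape of the loop: plan, act through a tool, observe, and repeat until the goal is met.

```python
# Minimal agentic loop -- all functions are hypothetical stand-ins.

def check_inventory(item: str) -> int:
    """Stand-in for an external tool the agent can invoke."""
    return {"gpu-node": 3}.get(item, 0)

def plan_next_action(goal: str, observations: list):
    """Stand-in for a model call that picks the next step toward the goal."""
    if not observations:                      # nothing observed yet: gather data
        return (check_inventory, "gpu-node")
    return None                               # goal satisfied: stop acting

def run_agent(goal: str) -> list:
    observations = []
    while (action := plan_next_action(goal, observations)) is not None:
        tool, arg = action
        observations.append(tool(arg))        # act autonomously, feed results back
    return observations

print(run_agent("report available GPU capacity"))  # -> [3]
```

A tool, by contrast, is a single call with no loop; the agent's defining trait is that it decides for itself which calls to make and when to stop.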
While inference on Kubernetes is mature, the emergence of agentic AI is creating new infrastructure challenges. These systems, which integrate with numerous external tools, are driving a fundamental change in what security and observability look like, leading to a resurgence in technologies like service mesh to manage complex, large-scale interactions.
Initiatives like the new CNCF AI Conformance for Kubernetes (which was announced on the first day of the conference) are critical for establishing a reliable baseline, signaling to the ecosystem that the platform is ready for prime time. The goal is to create standard abstractions for hardware accelerators (GPUs, TPUs) and networking, allowing the community to innovate on higher-level problems without getting bogged down in low-level implementation details. Lachlan Evenson put it succinctly when he said:
"We can put a stamp and signal to the community that Kubernetes as a platform with this set of APIs is ready for prime time, so the platform Builders...can come in, take that as standard, and build brand new communities and tools."
This allows tool creators and platform teams to build with confidence, knowing that the underlying Kubernetes distribution will perform as required.
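To illustrate what such a standard abstraction looks like in practice, the sketch below uses the official Kubernetes Python client to request a GPU through Kubernetes' extended-resource mechanism. The image name is hypothetical; "nvidia.com/gpu" is the resource key conventionally exposed by NVIDIA's device plugin, and a TPU or other accelerator would use a different key.

```python
# Sketch: requesting one GPU via Kubernetes' extended-resource abstraction.
# Assumes a reachable cluster in ~/.kube/config; the image is hypothetical.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="training-pod"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="example.com/trainer:latest",  # hypothetical image
                resources=client.V1ResourceRequirements(
                    # The scheduler places this pod on a node advertising a GPU.
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

Because the workload names only the resource, not the driver or hardware details, the same pattern carries across clusters and vendors that implement the abstraction.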
One of the significant challenges is bridging the gap between AI engineers, who typically work with Python frameworks and are not experts in Kubernetes, and the underlying infrastructure. The solution is not to force AI developers to learn Kubernetes, but to meet them where they are by providing interfaces and tools that abstract away the complexity of the platform.
The industry is shifting from simple cost optimization to day-one value optimization, driven by the scarcity of AI accelerators. Concurrently, service mesh technology is evolving from complex, sidecar-based models to simpler, more integrated ambient and sidecarless approaches that are better suited to the new security and traffic management demands of AI.
The demands of AI are pushing Kubernetes along two different scaling dimensions. On one hand, a handful of organizations need massive single clusters for large-scale distributed training. Brandon Royal noted that Google recently announced GKE support for 130,000 nodes in a single cluster to meet this demand.
On the other hand, edge AI use cases invert the problem: the challenge is managing a huge fleet of smaller clusters. Hong Wang cited a restaurant chain planning to run GPUs in individual restaurant locations, resulting in a massive number of endpoints to manage.
The ultimate measure of Kubernetes' success for AI is making it boring: a stable, reliable, and invisible foundation. When Kubernetes achieves this, engineers can submit a job and trust the platform to run it correctly, without needing to understand the underlying nodes, schedulers, or kernels.
Vendor Recaps
Below is a recap of some of my conversations with vendors at KubeCon. I tried to cover a wide range of technologies and vendor sizes, from startups with one or two employees to large, well-established companies such as Nutanix and SUSE.
Komodor
Founded in 2020, Komodor emerged from stealth with the mission of simplifying the operational complexity of Kubernetes environments. Since then, it has grown its customer base dramatically and extended its product to meet the demands of day-two operations, cost optimization, and drift detection in multi-cluster, multi-cloud Kubernetes landscapes. Companies large and small, including Intel, Priceline, Cisco, and OpenTable, use it.
Essentially, Komodor positions itself as an autonomous AI Site Reliability Engineering (SRE) platform designed to manage the increasing complexity of day-two Kubernetes operations. Its core AI agent continuously analyzes cluster health, identifies root causes, and recommends or executes remediations. This offloads a large portion of work traditionally handled by SRE and DevOps teams. The platform benefits from a deeply trained AI model with high reported accuracy and integrates tightly with the cloud-native ecosystem.
I asked how difficult it is to deploy, and they said the platform was designed for simplicity: it can be deployed with a single Helm command (a representative sketch appears below). It supports a wide range of use cases, from resolving GPU-related failures in AI training workloads to empowering non-experts to diagnose and solve issues.
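For context, a single-command Helm deployment typically looks like the following. I have not verified Komodor's actual chart coordinates, so the repository URL, chart name, and --set keys here are assumptions for illustration; consult Komodor's documentation for the real values.

```shell
# Illustrative only: repo URL, chart name, and --set keys are assumed, not verified.
helm repo add komodorio https://helm-charts.komodor.io
helm repo update
# apiKey and clusterName are hypothetical parameters (agent auth, display name).
helm install komodor-agent komodorio/komodor-agent \
  --set apiKey=<YOUR_API_KEY> \
  --set clusterName=<CLUSTER_NAME>
```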
They also emphasized their reliability-first approach to cost optimization, which distinguishes them from tools that prioritize savings at the expense of system stability. The platform's AI intelligently right-sizes workloads, optimizes pod placement, manages spot instances, and reduces over-provisioning while ensuring performance and availability remain intact.
When I asked what was on their roadmap, they said they plan to extend the platform's capabilities to AI inferencing workloads. The company aims to become the central AI-driven operations layer for the entire AI/ML lifecycle running on Kubernetes.
In part two of this post, I will cover a few more of the companies that I wanted to highlight.