KubeCon + CloudNativeCon North America 2025 -- Day 2

After an exciting first day at KubeCon, I was eager for another round of keynotes on day two. Watching crowds make their way to the sessions, I thought about how terms like "cloud native" and "AI infrastructure" are mentioned so frequently that they often feel vague and far removed from everyday experience. They can seem like concepts confined to distant data centers, which makes it easy to forget how users interact with them or how IT professionals manage them in practice. I was glad to learn that the second day's keynotes would shine a spotlight on real-world cases, showing how companies are tackling tough IT challenges that truly affect people.

KubeCon Day 2

The second day of KubeCon keynotes focused on practical IT experiences, featuring real-world stories from CNCF engineers and organizations. Topics ranged from the huge savings achieved by optimizing with a single line of code to modernizing legacy applications using container technology. The day also emphasized the human impact of IT work and recognized outstanding community members with CNCF awards.

How A Single Line of Code Saved 30,000 CPU Cores at OpenAI
OpenAI (the folks behind ChatGPT) does logging at a staggering scale, processing 10 petabytes of logs every day. To do this, they utilize the open-source tool Fluent Bit, which runs on every Kubernetes node. It comes as no surprise that OpenAI has an insatiable appetite for computing power, and as such, every CPU core is precious. Any core used by the logging agent is one that can't be used for other critical services.

Fabian Ponza, a member of the technical staff at OpenAI, examined Fluent Bit's performance and set out to optimize it. His initial thought was that the bottleneck must be expensive string processing. However, after using the low-level system profiling tool "perf," he found that he was wrong. The data revealed something far different: the process was spending 35% of its time in fstat64, a system call that checks a file's size.

The culprit was the inotify feature. Every time a new log line was written, inotify would trigger a file stat check to get the new size. Ponza's team made a one-line change to this feature. The result was incredible. This simple fix reduced Fluent Bit's CPU usage by 50%, regaining approximately 30,000 CPU cores' worth of capacity.
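To make the cost pattern concrete, here is a minimal sketch in Python. It is not Fluent Bit's code, and it is not the actual one-line change, which the talk summary above does not spell out. It only simulates change notifications with a loop (no real inotify is involved) and compares how many fstat calls a tailer makes when it stats the file on every notification versus once per batch. The names `CountingTailer`, `tail_per_event`, `tail_coalesced`, and `batch_size` are hypothetical and exist only for this illustration.

```python
import os
import tempfile


class CountingTailer:
    """Tracks a log file's size and counts the fstat calls it makes."""

    def __init__(self, fd):
        self.fd = fd
        self.stat_calls = 0
        self.known_size = 0

    def refresh_size(self):
        # One fstat system call per refresh.
        self.known_size = os.fstat(self.fd).st_size
        self.stat_calls += 1


def tail_per_event(tailer, events):
    # Naive pattern: every simulated change notification triggers a stat.
    for _ in range(events):
        tailer.refresh_size()


def tail_coalesced(tailer, events, batch_size):
    # Hypothetical alternative: stat once per batch of notifications.
    for i in range(events):
        if (i + 1) % batch_size == 0:
            tailer.refresh_size()
    if events % batch_size:
        tailer.refresh_size()  # pick up the trailing partial batch


def demo(events=10_000, batch_size=100):
    with tempfile.TemporaryFile() as f:
        f.write(b"log line\n" * events)
        f.flush()
        naive = CountingTailer(f.fileno())
        tail_per_event(naive, events)
        batched = CountingTailer(f.fileno())
        tail_coalesced(batched, events, batch_size)
        same_size = naive.known_size == batched.known_size
        return naive.stat_calls, batched.stat_calls, same_size


if __name__ == "__main__":
    print(demo())
```

Both tailers end up knowing the same file size, but the per-event version pays for one system call per log line. On a node writing logs at a very high rate, that kind of per-line overhead is what a profiler such as perf would surface.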

This story serves as a powerful reminder of the immense value of low-level system profiling, even in an era characterized by horizontal scaling. It is easy to get into the habit of throwing more resources at a problem rather than spending the time to investigate it, yet the most significant and efficient gains often come from understanding what is actually happening. Or, as Fabian Ponza said:

"Still, just because our community's gotten so good at scaling things horizontally doesn't mean that there's not value in breaking out your profiler of choice and seeing what's really happening under the hood and on your hardware."

One of the more interesting things that I took away from this presentation was that this huge saving was caught by a human, not by AI.

How MailChimp Moved a 20-Year-Old Monolith While Developers Barely Noticed
After its acquisition by Intuit, MailChimp faced a monumental task: migrate its core product to Intuit's modern developer platform. This wasn't just any application. It was a 20-year-old monolith that serves 11 million users and sends around 700 million emails daily.

In her presentation, Mara Kelly, Director of Engineering at Intuit, revealed that they had successfully migrated the entire application to a container-based architecture. But the most surprising outcome wasn't just that it worked. It was that most developers did not even notice it was happening.

This seamless transition was made possible by an Intuit concept called "done for you." The philosophy is simple: build all common components, such as compliance, security, and operational concerns, directly into the platform itself. This frees developers from having to worry about managing infrastructure, allowing them to focus entirely on building customer-facing features. This is a masterclass in the goals of Platform Engineering: when a well-designed internal platform serves as a powerful abstraction layer, massive, complex, and potentially disruptive backend migrations become almost invisible to the engineers who depend on those systems every day. Mara Kelly summarized it by saying:

"So, if you know that all of your services have to have certain things? Build it in, make it part of the platform. Do it for your developers."

Airbnb's Engineers Now Use AI Coding Tools
Even for a company at the forefront of the cloud-native movement, and one that was an early adopter of Kubernetes and Istio, the pace of change can be surprising. According to a presentation by Adam Hablowski, an engineer on Airbnb's infrastructure team, a quiet revolution has occurred within the company, as most of Airbnb's developers are now active users of AI coding tools.

And this isn't just about generating new code. Hablowski highlighted a specific and highly practical use case that is accelerating development: engineers are now using AI assistance to speed up the often tedious yet necessary task of modernizing and migrating older systems and codebases.

This highlights the rapid evolution of AI assistants from the hype cycle to becoming a core, practical tool in the software development lifecycle, even within mature engineering organizations. The value of AI coding tools isn't limited to writing new code; they are proving indispensable for addressing the essential maintenance that keeps complex systems running smoothly.

Technology is Saving Lives
The same powerful, open-source tools that enable massive tech companies to scale are also empowering non-profits and humanitarian organizations to achieve their missions on a global level. A panel on "Cloud Native for Good" highlighted that the efficiency and accessibility of this ecosystem allow organizations with limited budgets to solve some of the world's most complex human-facing issues.

The stories from these organizations are a reminder that our work has real-world consequences and has improved people's lives.

The Child Rescue Coalition uses a suite of CNCF projects to process 30 to 50 million leads every day to help track and convict child predators across 106 countries.

The American Red Cross utilizes tools like Kubernetes and Karpenter to dynamically scale its infrastructure, thereby reducing overhead costs and ensuring that "every dollar spent goes further to mission," whether managing the nation's blood supply or responding to over 60,000 disasters annually.

The United Nations, through Project Giga, utilizes cloud-native tools such as Kubernetes and Prometheus to assist governments in the Global South in connecting schools to the internet, mapping access, and building essential infrastructure.

These stories prove that the impact of cloud-native technology extends far beyond corporate data centers. The collaborative nature of open-source software means that every contribution, whether it involves code, documentation, or community support, has the potential to have a ripple effect throughout the community. It empowers organizations on the front lines of humanitarian crises to operate at a scale that was previously unimaginable. Roberto Mortaro of the Child Rescue Coalition gave a huge shout-out to the CNCF community.

"If you're a project maintainer, contributor, or end user: thank you. You are effectively contributing to our cause."

Awards
The CNCF is made up of great people, and at KubeCon + CloudNativeCon North America it recognized a few of the individuals and organizations that have made significant contributions to the cloud-native ecosystem, spanning technical leadership, community engagement, mentorship, and end-user adoption.

Below is a list of the award winners:

Lifetime Achievement Award: Recognizing longstanding impact - recipients: Dawn Chen & Kevin Wang

Top End User Award: Awarded to an organization driving real-world use and sharing best practices - recipient: Michelin

Top Committer Award: For outstanding technical contribution across CNCF projects - recipient: John Howard

Chop Wood Carry Water Award: Honoring the unsung heroes doing backbone work for the community - recipients: Mario Jason Braganza, Lubomir I. Ivanov, Daniel Hawton, Janet Kuo, Yuichi Nakamura, and memorial recognition for Han Kang

Outstanding Mentor Award: For sustained mentorship efforts - recipient: Lee Calcote

Cloud Native Hero Award: Recognizing individuals defending open source from patent threats via the "Heroes Challenge" - recipients: Chris Buccella, Ketan Sachdeva, Ritu Tyagi

TAGGIE Award: Celebrating contributions to CNCF's Technical Advisory Groups (TAGs) - recipients: Dawn Foster, Marina Moore, Leo Pahlke, Mauricio Salatino

End User Case Study Contest: Highlighting real-world adoption of CNCF technologies - winner: OpenAI for their work with Fluent Bit at petabyte-scale logging

Top Documentarian Award (also "Lorem Ipsum Award"): For excellence in documentation across projects - recipients: Aidan Delaney, Tiffany Hrabusa, Seokho Son

After the keynotes, I headed back to the show floor, where I talked with various projects and vendors about their products. I will cover some of those discussions in another post.

After the day ended, I attended a few after-hours parties, and I give a hat tip to the Après Kube ATL: Blues, Brews, and BBQ event, put on by Cloudsmith, Chainguard, Mend.io, Sysdig, and Tailscale. They picked a great location, served great food, and had good music playing in the background that was loud enough to enjoy but not so loud that it got in the way of conversation.

Reflections on the Day
As I walked back to the hotel, I reflected on how much I enjoyed the day. The stories from the cloud-native community were about more than just technology. They were about the profound impact on business and society that open-source collaboration makes possible. From saving millions on computing costs to saving lives, this ecosystem is shaping our world in ways both seen and unseen.

In my next article, I will share with you some of the discussions I had with various vendors at the show.
