OpenStack improves support for AI workloads
OpenStack allows enterprises to manage their own AWS-like private clouds on-premises. Even after 29 releases, it’s still among the most active open source projects in the world, and this week the OpenInfra Foundation, which shepherds the project, announced the latest of those releases. Dubbed “Caracal,” the new version emphasizes features for hosting AI and high-performance computing (HPC) workloads.
The typical OpenStack user is a large enterprise company. That may be a retailer like Walmart or a large telco like NTT. What virtually all enterprises have in common right now is that they’re thinking about how to put their AI models into production, all while keeping their data safe. For many, that means keeping total control of the entire stack.
As Nvidia CEO Jensen Huang recently noted, we’re at the cusp of a multitrillion-dollar investment wave that will go into data center infrastructure. A large chunk of that is investments by the large hyperscalers, but a lot of it will also go into private deployments — and those data centers need a software layer to manage them.
That puts OpenStack in an interesting position right now as one of the only comprehensive alternatives to VMware, which is facing its own issues as many VMware users aren’t all that happy about the company’s sale to Broadcom. More than ever, VMware users are looking for alternatives. “With the Broadcom acquisition of VMware and some of the licensing changes they’ve made, we’ve had a lot of companies coming to us and taking another look at OpenStack,” OpenInfra Foundation executive director Jonathan Bryce explained.
A lot of OpenStack’s growth in recent years was driven by its adoption in the Asia-Pacific region. Indeed, as the OpenInfra Foundation announced this week, its newest Platinum Member is Okestro, a South Korean cloud provider with a heavy focus on AI. But Europe, with its strong data sovereignty laws, has also been a growth market; the U.K.’s Dawn AI supercomputer, for example, runs OpenStack.
“All the things are lining up for a big upswing and open source adoption for infrastructure,” OpenInfra Foundation COO Mark Collier told TechCrunch. “That means OpenStack primarily, but also Kata Containers and some of our other projects. So it’s pretty exciting to see another wave of infrastructure upgrades give our community some important work to complete for many years to come.”
In practical terms, one of the headline features of this release is support for vGPU live migration in Nova, OpenStack’s core compute service. Users can now move GPU-backed workloads from one physical server to another with minimal disruption, something enterprises have been asking for as they look to manage their costly GPU hardware as efficiently as possible. Live migration has long been a standard Nova feature for CPU-based workloads, but this is the first time it’s available for GPUs as well.
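For operators, triggering such a migration from the command line looks much like a regular Nova live migration. A minimal sketch using the OpenStack client (the server name and target host below are hypothetical placeholders, and exact flags can vary by client version):

```sh
# Live-migrate a running instance, including its vGPU, to another
# compute node with minimal downtime for the workload.
# "my-gpu-instance" and "compute-02" are placeholder names.
openstack server migrate --live-migration --host compute-02 my-gpu-instance

# Check that the instance is ACTIVE again and landed on the new host.
openstack server show my-gpu-instance -c status -c OS-EXT-SRV-ATTR:host
```

Whether a given vGPU type can actually be migrated depends on the underlying GPU driver and hypervisor support, so this is best treated as the general workflow rather than a guarantee for every card.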
The latest release also brings a number of security enhancements, including role-based access control (RBAC) for more core OpenStack services, like the Ironic bare-metal-as-a-service project. That’s in addition to networking updates to better support HPC workloads and a slew of other updates. You can find the full release notes here.
This update is also the first since OpenStack moved to its “Skip Level Upgrade Release Process” (SLURP) a year ago. The OpenStack project cuts a new release every six months, but that’s too fast for most enterprises — and in the early days of the project, most users would describe the upgrade process as “painful” (or worse).
Today, upgrades are much easier and the project is also far more stable. The SLURP cadence introduces something akin to a long-term support release: on an annual basis, every second release is a SLURP release that’s easy to upgrade to, even as the teams still produce major updates on the original six-month cycle for those who want a faster cadence.
Over the years, OpenStack has gone through its up-and-down cycles in terms of perception. But it’s now a mature system backed by a sustainable ecosystem — something that wasn’t necessarily the case at the height of its first hype cycle 10 years ago. In recent years, it found a lot of success in the telco world, which allowed it to go through this maturation phase, and today it may just find itself in the right place at the right time to capitalize on the AI boom, too.