We're Building the Future of (AI) Cloud Infrastructure
Sep 16, 2025
Today, after years of hard work, we’re ecstatic to announce the launch of Unikraft Cloud, a radically new cloud platform built on years of research and unikernels, providing orders-of-magnitude better scalability and efficiency (think running millions of strongly isolated instances on a few servers instead of an entire datacenter).
With that in place, it might be worth saying a few words about where we started. It may seem counterintuitive for a startup to say that its journey began close to 10 years ago: ours has certainly not been a typical path, to say the least (and I’m not even sure it’s one I’d recommend!).
We come from a research and open source background; back then we were looking into how to make packet processing fast in software, at 10-40Gb/s (which used to be blazingly fast!), work that was a precursor to the now well-established Intel DPDK framework. From there, being the optimization geeks we are, we moved towards doing packet processing efficiently within a virtual machine, essentially creating an operating system whose only purpose was to process packets, and wrapping a virtual machine around it. For those familiar with the Xen hypervisor, we began this with mini-os, a very simple reference OS implementation onto which we slapped MIT’s Click modular router, something we ended up calling ClickOS. Long story short, this sort of work eventually became known as Network Function Virtualization.
As we started getting into virtualization, this got us thinking about cloud infrastructure and how it was built: no one could argue, even back then, against the power of the cloud and its amazing functionality. But was it built efficiently? Did its components (and do they still) scale well without throwing obscene amounts of money, hardware, and electricity at the problem?
The answer for us, back then and still today, is a resounding no, and the relatively recent explosion in AI agents and AI-generated workloads is putting severe strain on legacy cloud infra, exposing just how poorly it actually scales (again, without throwing silly amounts of money at the problem).
But I’m getting ahead of myself; getting back to the story of our beginnings, our first port of call when it came to building a radically more efficient and scalable cloud infra platform was the images themselves: GB-sized images being deployed to run applications that needed only MBs. No wonder virtual machines (VMs) had (still have?) a reputation for being chunky and resource-hungry.
Debunking the myth that VMs are, by definition, heavyweight was one of our initial missions; we went as far as publishing a paper at SOSP, the top systems conference in the world, cheekily titled “My VM is Lighter (and Safer) than your Container”, to highlight that VMs need not be heavyweight, and that containers are not safe for production deployments in the cloud (that seems obvious today, but it was far from it back then). In that work we were getting VMs to cold start in as little as 2 milliseconds, roughly comparable to fork/exec on Linux.
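For a rough sense of that baseline, here’s a small micro-benchmark (our own sketch for this post, not code from the paper) that measures average fork/exec latency on Linux by repeatedly spawning /bin/true:

```c
/* Sketch: average fork+exec latency on Linux. Spawns /bin/true many
 * times and reports the mean wall-clock cost per spawn. */
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void) {
    const int iters = 1000;
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iters; i++) {
        pid_t pid = fork();
        if (pid == 0) {                       /* child: replace image */
            execl("/bin/true", "true", (char *)NULL);
            _exit(127);                       /* only reached if exec fails */
        }
        waitpid(pid, NULL, 0);                /* parent: reap the child */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double total_ms = (end.tv_sec - start.tv_sec) * 1e3 +
                      (end.tv_nsec - start.tv_nsec) / 1e6;
    printf("fork+exec(/bin/true): %.3f ms on average over %d runs\n",
           total_ms / iters, iters);
    return 0;
}
```

Numbers vary by machine and kernel, but it gives you a concrete baseline to compare those 2-millisecond VM cold starts against.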
To make VMs small, a key insight was that for cloud deployments we know, at build time, what the target service or application will be, so we shouldn’t be using general-purpose operating systems like Linux that can run anything. Likewise, why use extremely bloated distributions that come with everything but the kitchen sink, when all I might want to do is run a simple Python script? And in case you’re wondering why we were, and still are, so bullish about VMs as the cloud’s workhorse: they’re the only tech that provides the strong, hardware-level isolation required for production workloads in the cloud (no, containers aren’t a good solution for production, and in public clouds they’re more often than not deployed within a VM anyway; and don’t even get me started on isolates and other language-level isolation primitives).
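To make that concrete, here’s the kind of footprint we have in mind (a hypothetical example; `server.py` is a stand-in for your app): an image that carries a Python interpreter, one script, and essentially nothing else.

```dockerfile
# Hypothetical minimal image: one interpreter, one script, and no
# general-purpose distribution full of things the app never uses.
FROM python:3.12-alpine
COPY server.py /server.py
CMD ["python3", "/server.py"]
```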
So the goal was to make it easy to create efficient VMs tailored to specific target applications; the technical term for such specialized VMs is unikernels. For those who have heard of them, unikernels are famous for providing great performance (start times in milliseconds, images of just a few MBs, faster and more efficient I/O, etc.), but infamous for being difficult to build and use. To tackle this problem, back in 2018 or so we created the Linux Foundation Unikraft open source project with the explicit goal of making it easy to build efficient VMs without having to modify apps or know anything about operating systems, virtualization, or unikernels, and to integrate it all with familiar tooling such as Dockerfiles, Kubernetes, and Prometheus.
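In practice that looks roughly like this, using the project’s KraftKit CLI (a sketch; check the Unikraft docs for the exact flags and setup on your system):

```sh
# Sketch: build a unikernel from the app in the current directory and
# boot it locally as a lightweight VM (see the Unikraft docs for details).
kraft build .
kraft run .
```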
Building the Next (R)evolution in Cloud Infrastructure Platforms
Fast forward several years and we decided that the open source project was mature enough to start building a commercial offering around it, so in 2022 we partnered up with US and EU investors and created a startup.
Naively, at first we thought we could simply wrap our unikernels in popular public cloud formats like AMIs, deploy them on AWS, and expect performance and efficiency to improve significantly. What seems obvious in retrospect is that when we did so, almost nothing changed: when we hit a button to start an instance on the AWS dashboard, the instance would take seconds to start, not the single-digit milliseconds we were used to on our local boxes; and I/O performance, memory consumption, etc., would be nowhere near our local benchmarks.
After a fair amount of performance debugging and digging into the underlying infra, it became fairly clear that if we wanted to build a platform that could react to traffic in milliseconds, and that could scale to hundreds of thousands or millions of (strongly isolated, VM-based) instances on a single server, we’d have to go back to first principles and revisit all of the main, basic components of a cloud infrastructure platform.
To say that this has been a large endeavor is putting it mildly: building controllers, proxies, snapshotting mechanisms, and a number of other components from scratch has been a major, incredibly challenging (and, if we’re honest, fun!) undertaking, consuming the better part of 18 months of heavy engineering, on top of all the prior years of work described so far in this story.
The result is Unikraft Cloud: a platform where all workloads are strongly isolated by default, that can cold start any workload (or wake it from scale-to-zero) in under 10 milliseconds, and that can scale to 100K+ instances on a single server (and we’re pushing this scalability further every day). Think running millions of instances on a few servers rather than a data center, all of it based on familiar tooling such as Docker and Kubernetes.
A Platform for AI Scale
Why does this matter for AI? AI workloads (agents, headless browsers, AI-generated code, etc.) have three main characteristics in terms of deployment:
- Massive scale: millions of agents or functions generated in short periods of time — and we’re only at the beginning of the growth curve.
- Untrusted code: increasingly, code is generated by AI for non-programmers who can’t be expected to test it for vulnerabilities or other issues; in the future, most deployed code will arguably have been (vibe) “coded” by non-programmers.
- Unpredictable demand: rapidly changing workload patterns mean that legacy cloud platforms must leave (idle) instances on all the time — massive workload scale requires massive infrastructure (and cost!).
Unikraft Cloud is built from the ground up to cope with the massive scale and unpredictability of such workloads: it can start instances just-in-time, in milliseconds, as requests for them arrive, without end users ever noticing that instances had been scaled down to zero. And since the basic unit of work on the platform is a virtual machine, strong isolation is a non-negotiable, hard-coded default. In short, the basic characteristics of Unikraft Cloud are (with a quick deployment sketch after the list):
- Minimal cloud stack: the images contain your application, and little else — no more crud
- Super-fast starts: in 10ms or less — death to cold starts
- Massive server density: 100K+ instances per server — no more idling
- Unit economics: offer services at a fraction of the cost — outpace the competition
- Zero compromises: run any service with strong isolation — no hidden gotchas
- No tooling changes: use Docker, Kubernetes, Prometheus — all the standard tools you know and love
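And here is the deployment sketch promised above (hypothetical: the URL is a placeholder you get at deploy time, and exact flags may differ, so check the Unikraft Cloud docs). You deploy straight from a directory containing a Dockerfile, and a scaled-to-zero instance wakes transparently on the first request:

```sh
# Hypothetical workflow: deploy from the directory holding your Dockerfile,
# mapping public port 443 to the app's port 8080.
kraft cloud deploy -p 443:8080 .

# A first request after idling wakes the instance in milliseconds;
# <your-app-url> is a placeholder printed at deploy time.
curl https://<your-app-url>
```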
I should also note what may seem obvious: while the platform is clearly an amazing fit for AI workloads, it is application-agnostic, and we have found it to be equally powerful in use cases as diverse as databases, build pipelines, functions, remote IDEs, ETL pipelines, and logging and monitoring, to name a few. Basically, our platform is perfect for anything that needs scale or blazingly fast scale-to-zero, and for any workload where the ratio between the time an app takes to start and the time it spends doing useful work is poor (CI/CD pipelines, I’m looking at you).
But don’t take our word for it: sign up for free and take it for a spin, or drop me a line; I’d be happy to talk about how Unikraft Cloud might help you and, if you’re curious, to tell you more about the tech behind it. Oh, and if you love big engineering challenges and would like to help build the next generation of cloud infra, check out our careers page: we have some of the top minds in systems engineering, so we guarantee you’ll never be bored. Either way, let’s chat!