Load Balancing

Load balancing in Unikraft Cloud is easy: as soon as you attach more than one instance to a service, Unikraft Cloud will automatically start balancing traffic between them. The load balancing happens based on the number of connections (in TCP mode) or requests (in HTTP mode). More on that below.

Because of load balancing, instances in a service must be of the same kind (for example, expose the same port). You can remove instances from a service at any time, and, when you do, Unikraft Cloud will immediately take the instance out of the load balancing service.

Soft and hard limits

Services have soft and hard limits for the number of concurrent requests and connections. The limits apply per instance. For HTTP services (that is, using the http handler) the system checks each individual in-flight request against the limit, but not the underlying TCP connection. For TCP services the individual open connections get counted. In the following, the term request refers to both requests and connections.

The load balancer uses the soft limit to decide when to wake up another standby instance. For example, if you set the soft limit to 5 and the service consists of 2 standby instances, one of the instances receives up to 5 concurrent requests. The 6th parallel request wakes up the second instance. If there are no more standby instances to wake up, the number of requests assigned to each instance will exceed the soft limit. The load balancer ensures that when the number of in-flight requests goes down again, instances go into standby as fast as possible.

The hard limit defines the maximum number of concurrent requests that an instance can handle. The load balancer will never assign more requests than that to a single instance. If no other instances are available, the load balancer blocks excess requests rather than queuing them.
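The interplay of the two limits can be sketched as a small model. This is only an illustration of the behavior described above, not Unikraft Cloud's implementation: the `Instance` class, the `assign` function, and the hard limit of 8 are hypothetical, and the order in which standby instances wake up is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    name: str
    awake: bool = False   # False models a standby instance
    in_flight: int = 0    # current concurrent requests (or connections)


def assign(instances: List[Instance], soft: int, hard: int) -> Optional[Instance]:
    """Place one new request; return the chosen instance, or None if blocked."""
    awake = [i for i in instances if i.awake]
    # 1. Prefer an awake instance that is still below the soft limit.
    below_soft = [i for i in awake if i.in_flight < soft]
    if below_soft:
        target = min(below_soft, key=lambda i: i.in_flight)
    else:
        # 2. Otherwise wake a standby instance, if one exists (assumed order: first found).
        standby = [i for i in instances if not i.awake]
        if standby:
            target = standby[0]
            target.awake = True
        else:
            # 3. No standby left: exceed the soft limit, but never the hard limit.
            below_hard = [i for i in awake if i.in_flight < hard]
            if not below_hard:
                return None  # blocked, not queued
            target = min(below_hard, key=lambda i: i.in_flight)
    target.in_flight += 1
    return target


# Two standby instances, soft limit 5 (as in the example above), hypothetical hard limit 8.
pool = [Instance("a"), Instance("b")]
placed = [assign(pool, soft=5, hard=8) for _ in range(17)]
names = [p.name if p else None for p in placed]
```

In this model the first 5 requests land on `a`, the 6th wakes `b`, and once both instances hold 8 requests the 17th is blocked because nothing is left below the hard limit.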

Setup

To set load balancing up, first use kraft cloud deploy with the -p flag to create the service as part of the instance creation. For example, use NGINX as the app:


 
git clone https://github.com/unikraft-cloud/examples
cd examples/nginx/
kraft cloud deploy -p 443:8080 .

This single command will (a) create a service via the -p flag and (b) start an NGINX instance:


 
[●] Deployed successfully!
 │
 ├────────── name: nginx-8ujeu
 ├────────── uuid: d6238ac6-27d2-47b3-8a45-c6cac99fb4ef
 ├───────── state: running
 ├─────────── url: https://wandering-shape-n6mhimgn.fra.unikraft.app
 ├───────── image: nginx@sha256:269192f523dca7498423bc54676ab08e415e9c7442d1bd3d65f07ab5e50a43d
 ├───── boot time: 20.18 ms
 ├──────── memory: 128 MiB
 ├─────── service: wandering-shape-n6mhimgn
 ├── private fqdn: nginx-8ujeu.internal
 ├──── private ip: 172.16.6.7
 └────────── args: /usr/bin/nginx -c /etc/nginx/nginx.conf

With this in place, it's now time to start a second instance and attach it to the created service (in this case, named wandering-shape-n6mhimgn):


 
cd examples/nginx/
kraft cloud deploy --service wandering-shape-n6mhimgn .

If you get a prompt saying "deployment already exists: what would you like to do with the 1 existing instances," choose keep.

The command's output should look like this:

[●] Deployed successfully!
 │
 ├────────── name: nginx-djta3
 ├────────── uuid: 06c972a6-a117-4b07-8eba-9389b4cccb42
 ├───────── state: running
 ├─────────── url: https://wandering-shape-n6mhimgn.fra.unikraft.app
 ├───────── image: nginx@sha256:c00c11a5cbd6a3020dd4d9703fbeb2a2f2aab37f18f7a0ba9c66db5a71897c3a
 ├───── boot time: 20.46 ms
 ├──────── memory: 128 MiB
 ├─────── service: wandering-shape-n6mhimgn
 ├── private fqdn: nginx-djta3.internal
 ├──── private ip: 172.16.6.3
 └────────── args: /usr/bin/nginx -c /etc/nginx/nginx.conf

Both the url and service fields in the 2 instances are the same, as they should be. To check that it worked, run the following command:


 
kraft cloud service list -o list

You should see output such as:


 
       uuid: 954b4a52-fc51-4eac-b5c9-1fc2a368a237
       name: wandering-shape-n6mhimgn
       fqdn: wandering-shape-n6mhimgn.fra.unikraft.app
   services: 443:8080/tls+http 80:443/http+redirect
  instances: 06c972a6-a117-4b07-8eba-9389b4cccb42 d6238ac6-27d2-47b3-8a45-c6cac99fb4ef
 created at: 14 minutes ago
 persistent: true

Note the two instances (their UUIDs) under the instances field. You're now load balancing across 2 NGINX instances!

Currently it's impossible to set the service's name via kraft cloud deploy. If you'd like to set the service's name, first create the service with the kraft cloud service command. You can then use the --service flag of kraft cloud deploy to attach instances to that service, as outlined in this guide.

If you want to ensure that no existing connections get dropped when stopping an instance, use Unikraft Cloud's draining feature by stopping the instance through kraft cloud instance stop -w <TIMEOUT_MS>.

Load balancing algorithm

The load balancing algorithm is a variant of least_conn. For every instance, the load balancer tracks the number of current in-flight TCP connections (if in tcp mode) or requests (if in http mode).

To select an instance, the system goes over all instances in the service. It finds the set of instances that have the fewest in-flight requests/connections, and picks one from that set.

To illustrate, imagine this scenario:

Instance ID	# in-flight conns
`i-0`	4
`i-1`	1
`i-2`	2
`i-3`	1

In this case, the algorithm would first narrow the choice to instances i-1 and i-3, since they both have the fewest connections at the moment (only 1 each). It would then pick one of these 2 instances and assign the new connection to it. For example, if it chose i-1, the next new connection would go to i-3, because i-3 would now be the only instance with just 1 connection (assuming none of the connections on the other instances close in the meantime).
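The scenario above can be reproduced with a few lines of Python. This is a minimal sketch of least-connections selection, not the platform's actual code: the function names `least_loaded` and `pick` are hypothetical, and breaking ties at random is an assumption, since the text doesn't say how the choice within the least-loaded set is made.

```python
import random
from typing import Dict, List, Optional


def least_loaded(in_flight: Dict[str, int]) -> List[str]:
    """Return every instance tied for the fewest in-flight connections."""
    fewest = min(in_flight.values())
    return [name for name, count in in_flight.items() if count == fewest]


def pick(in_flight: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Choose one instance from the least-loaded set and count the new connection."""
    rng = rng or random.Random()
    chosen = rng.choice(least_loaded(in_flight))  # assumed tie-break: random
    in_flight[chosen] += 1
    return chosen


# The table from the scenario above.
conns = {"i-0": 4, "i-1": 1, "i-2": 2, "i-3": 1}
candidates = least_loaded(conns)  # the two instances with 1 connection each
first = pick(conns)               # either i-1 or i-3
second = pick(conns)              # whichever of the two was not picked first
```

Whichever instance the first call picks, the second call goes to the other one, because it is then the only instance left with a single connection.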

Learn more

The kraft cloud command-line tool reference, and in particular the services subcommand.
Unikraft Cloud's REST API reference, and in particular the section on services.

Last modified on November 19, 2025
