Load Balancing
Load balancing in Unikraft Cloud (UKC) needs no extra configuration: as soon as you attach more than one instance to a service, UKC automatically starts balancing traffic across those instances. Traffic is balanced by number of connections (in TCP mode) or number of requests (in HTTP mode); more on that below.
Because traffic can reach any instance in a service, all instances in that service must be of the same kind (e.g., expose the same port). You can remove an instance from a service at any time; when you do, UKC immediately takes it out of load balancing.
To set load balancing up, first use `kraft cloud deploy` with the `-p` flag so that a service is created as part of the instance creation. For example, let's use NGINX as the app:
This single command will (a) create a service via the `-p` flag and (b) start an NGINX instance:
With this in place, it's now time to start a second instance and attach it to the service that was just created (named, in this case, `wandering-shape-n6mhimgn`):
If you get a prompt saying "deployment already exists: what would you like to do with the 1 existing instance(s)?", simply choose `keep`.
The command's output should be similar to this one:
Notice that both the `url` and `service` fields in the 2 instances are the same, as they should be. To check that it worked, run the following command:
You should see output such as:
Note the two instances (their UUIDs) under the `instances` field. You're now load balancing across 2 NGINX instances!
Currently it's not possible to set the service's name via `kraft cloud deploy`. If you'd like to set the service's name, first use the `kraft cloud service` command to create the service, and then use the `--service` flag of `kraft cloud deploy` to attach the instance being created to the service, as outlined in this guide.
If you want to make sure that no existing connections are dropped when stopping an instance, use UKC's draining feature by stopping the instance with `kraft cloud instance stop -w TIMEOUT_MS`.
Load Balancing Algorithm
The load balancing algorithm is a variant of `least_conn`. For every instance, we track the number of current in-flight TCP connections (if in `tcp` mode) or requests (if in `http` mode).
To select an instance, we go over all instances in the service, find the set of instances that have the fewest in-flight requests/connections, and pick one at random from that set.
To illustrate, imagine we had the following scenario:
Instance ID | # in-flight conns |
---|---|
i-0 | 4 |
i-1 | 1 |
i-2 | 2 |
i-3 | 1 |
In this case, the algorithm would first shortlist instances `i-1` and `i-3`, since they both have the fewest connections at the moment (only 1 each). After that, the algorithm would choose randomly between these 2 instances and assign the new connection to the one it picked. For example, if it chose `i-1`, the next new connection would go to `i-3`, since it would then be the only instance with just 1 connection (assuming none of the connections that the other instances are handling are closed in the meantime).
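The selection step can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not UKC's actual implementation: the function name `pick_instance` and the dictionary layout are assumptions made for the example, and the counts are taken from the table.

```python
import random

def pick_instance(in_flight, rng=random):
    """Return the ID of one instance with the fewest in-flight connections.

    `in_flight` maps instance ID -> current in-flight connections (TCP mode)
    or requests (HTTP mode). Hypothetical helper, for illustration only.
    """
    if not in_flight:
        raise ValueError("service has no instances")
    fewest = min(in_flight.values())
    # Every instance tied for the minimum is an equally valid candidate.
    candidates = sorted(i for i, n in in_flight.items() if n == fewest)
    return rng.choice(candidates)

# Scenario from the table above.
in_flight = {"i-0": 4, "i-1": 1, "i-2": 2, "i-3": 1}

first = pick_instance(in_flight)   # either i-1 or i-3
in_flight[first] += 1              # the new connection is now in flight

second = pick_instance(in_flight)  # the other one of i-1 / i-3
in_flight[second] += 1

print(first, second)
```

After these two picks, `i-1`, `i-2`, and `i-3` each hold 2 connections, so a third connection would be assigned at random among those three.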
Learn More
- The `kraft cloud` CLI reference, and in particular the services sub-command
- Unikraft Cloud's REST API reference, and in particular the section on services