
Scale-to-Zero

With conventional cloud platforms you need to keep at least one instance running at all times to be able to respond to incoming requests: performing a just-in-time cold boot is simply too time-consuming and would create a response latency of multiple seconds or worse.

This is not the case with Unikraft Cloud (UKC). Based on extremely lightweight unikernel technology, instances on UKC are able to cold boot within milliseconds, while providing the same strong, hardware-level isolation afforded by virtual machines.

Millisecond cold boots allow us to perform low-latency scale-to-zero: that is, as long as no traffic is flowing through your instance, it consumes no resources. When the next connection arrives, Unikraft Cloud takes care of transparently cold booting (can it be called cold booting if it's milliseconds?) your instance and replying -- all of that within a negligible amount of time with respect to Internet RTTs and so unbeknownst to your end users.

By default, Unikraft Cloud reduces network and cloud stack cold start time to a minimum. If you need to deploy an app whose initialization takes a while to finish (e.g., Spring Boot, Puppeteer, etc.) and would still like to retain millisecond cold starts, Unikraft Cloud provides a stateful feature to deal with this; please check out this guide for more information on how to set this up.

If you add multiple instances to a service, the service will load balance traffic across all of them and UKC will ensure each instance gets woken up as needed. This differs from autoscale, in which you don't specify the number of instances -- the platform does this for you based on traffic load.

Setting it Up

Millisecond scale-to-zero is applied either via a label in each subdirectory's Kraftfile or with the --scale-to-zero flag in the relevant CLI subcommands.

YAML
labels:
  cloud.unikraft.v1.instances/scale_to_zero.policy: "on"
  cloud.unikraft.v1.instances/scale_to_zero.cooldown_time_ms: 1000

The cooldown_time_ms label tells UKC how long, in milliseconds, the instance must be idle before it is scaled to zero. Our examples already set values that work for each of them, so you don't have to worry about this label if you don't want to.
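As a mental model only, the cooldown rule can be pictured as a comparison between how long an instance has been idle and the configured cooldown. The helper below is hypothetical and is not UKC's implementation; in particular, whether the platform uses "at least" or "strictly more than" is an assumption here.

```shell
# Hypothetical helper: models the cooldown rule, NOT UKC's real implementation.
# Usage: should_scale_to_zero IDLE_MS COOLDOWN_MS
# Succeeds (exit status 0) when the idle time has reached the cooldown.
should_scale_to_zero() {
  idle_ms=$1
  cooldown_ms=$2
  [ "$idle_ms" -ge "$cooldown_ms" ]
}

# With a 1000 ms cooldown: 250 ms of idleness is not enough, 1500 ms is.
if should_scale_to_zero 250 1000; then echo "standby"; else echo "running"; fi
if should_scale_to_zero 1500 1000; then echo "standby"; else echo "running"; fi
```

A larger cooldown keeps the instance running between closely spaced requests; a smaller one releases resources sooner.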

You can disable scale to zero either by setting the label to false, or with the --scale-to-zero=off flag.

Since UKC has scale to zero on by default, all you need to do is to start an instance normally:

Terminal
git clone https://github.com/kraft-cloud/examples
cd examples/nginx/
kraft cloud deploy -0 -p 443:8080 .

This command will create the NGINX instance with scale to zero enabled:

Terminal
[●] Deployed successfully!
────────── name: nginx-1a747
────────── uuid: 66d05e09-1436-4d1f-bbe6-6dc03ae48d7a
───────── state: running
─────────── url: https://twilight-gorilla-ui5b6kwt.fra0.kraft.host
───────── image: nginx@sha256:19854a12fe97f138313cb9b4806828cae9cecf2d050077a0268d98129863f954
───── boot time: 19.81 ms
──────── memory: 128 MiB
─────── service: twilight-gorilla-ui5b6kwt
── private fqdn: nginx-1a747.internal
──── private ip: 172.16.6.1
────────── args: /usr/bin/nginx -c /etc/nginx/nginx.conf

Note that at first the status is listed as running in the output of the kraft cloud deploy command. Let's check the instance's status:

Terminal
kraft cloud instances list

You should see output similar to:

Terminal
NAME         FQDN                                       STATE
nginx-1a747  twilight-gorilla-ui5b6kwt.fra0.kraft.host  standby

Notice that the state is now standby. At first kraft cloud deploy reports the state as running, but UKC then immediately puts the instance to sleep (more accurately, it stops the instance but keeps the state needed to start it again on demand).
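If you want to check the state from a script rather than by eye, a small filter over the listing is enough. This is a sketch that assumes the output is whitespace-separated with a header row containing NAME and STATE; the instance_state helper is hypothetical, and if your kraft version prints columns whose values contain spaces, the field splitting would need adjusting.

```shell
# Hypothetical helper: print the STATE of one instance, reading the output of
# `kraft cloud instances list` on stdin. Column positions are looked up from
# the header row, so extra single-word columns do not matter.
instance_state() {
  awk -v name="$1" '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "NAME")  name_col = i
        if ($i == "STATE") state_col = i
      }
      next
    }
    name_col && state_col && $name_col == name { print $state_col; exit }
  '
}

# Sample copied from the listing in this guide, so the snippet runs offline.
sample='NAME         FQDN                                       STATE
nginx-1a747  twilight-gorilla-ui5b6kwt.fra0.kraft.host  standby'

printf '%s\n' "$sample" | instance_state nginx-1a747
```

Against a live deployment, the same helper would be used as: kraft cloud instances list | instance_state nginx-1a747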

You can also check that scale to 0 is enabled through the kraft cloud scale command:

Terminal
kraft cloud scale get twilight-gorilla-ui5b6kwt

which outputs:

Terminal
uuid: 126c4ecb-4718-4a25-9f75-ac9149da9e19
name: twilight-gorilla-ui5b6kwt
enabled: true
min size: 0
max size: 1
warmup (ms): 1000
cooldown (ms): 1000
master: 66d05e09-1436-4d1f-bbe6-6dc03ae48d7a
policies:

Note the min size (0) and max size (1) fields -- these mean that the service can scale from max 1 instance to min 0 instances, meaning that scale to 0 is enabled.
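The same check can be scripted. The sketch below assumes the command prints one "key: value" pair per line with the field names shown in this guide; scale_field and is_scale_to_zero are hypothetical helper names, not part of the kraft CLI.

```shell
# Hypothetical helper: print the value of one field from
# `kraft cloud scale get` output read on stdin.
scale_field() {
  awk -F': *' -v key="$1" '
    { k = $1; sub(/^[ \t]+/, "", k) }   # tolerate leading indentation
    k == key { print $2; exit }
  '
}

# Succeeds when the minimum size is 0, i.e. the service may scale to zero.
is_scale_to_zero() {
  [ "$(scale_field 'min size')" = "0" ]
}

# Sample copied from the output in this guide, so the snippet runs offline.
sample='uuid: 126c4ecb-4718-4a25-9f75-ac9149da9e19
name: twilight-gorilla-ui5b6kwt
enabled: true
min size: 0
max size: 1
warmup (ms): 1000
cooldown (ms): 1000'

if printf '%s\n' "$sample" | is_scale_to_zero; then
  echo "scale to zero: enabled"
else
  echo "scale to zero: disabled"
fi
```

Against a live service, pipe the real command into the helper instead of the sample: kraft cloud scale get twilight-gorilla-ui5b6kwt | is_scale_to_zero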

Testing Scale-to-Zero

Now let's take this out for a spin. Try using curl or your browser to see scale to 0 (well, scale to 1 in this case!) in action:

Terminal
curl https://twilight-gorilla-ui5b6kwt.fra0.kraft.host

You should get an NGINX response with no noticeable delay. For fun, try the following command to see if you can catch the instance's STATE field changing from standby to running:

Terminal
watch --color -n 0.5 kraft cloud instance list

If you curl enough, you should see the STATE turn to a green running from time to time:

Terminal
NAME         FQDN                                       STATE
nginx-1a747  twilight-gorilla-ui5b6kwt.fra0.kraft.host  running
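To put a number on "no noticeable delay", you could time requests that arrive after the instance has gone back to standby. The sketch below relies on curl's --write-out variable time_total; the request_time and summarize_times helpers are hypothetical, and the timings in the offline demonstration are placeholder values, not measurements.

```shell
# Time one request end to end; prints the total time in seconds.
request_time() {
  curl -s -o /dev/null -w '%{time_total}\n' "$1"
}

# Summarise timings given one number per line on stdin.
summarize_times() {
  awk '
    { t = $1 + 0; sum += t }
    NR == 1 { min = max = t }
    t < min { min = t }
    t > max { max = t }
    END { if (NR) printf "n=%d min=%.3f avg=%.3f max=%.3f\n", NR, min, sum / NR, max }
  '
}

# Offline demonstration with placeholder timings (not real measurements):
printf '0.120\n0.080\n0.100\n' | summarize_times

# Against a live instance, sleep longer than the cooldown between requests so
# that each one has to wake the instance from standby:
#   for i in 1 2 3 4 5; do
#     request_time https://twilight-gorilla-ui5b6kwt.fra0.kraft.host
#     sleep 2
#   done | summarize_times
```

The reported time includes DNS resolution, the TLS handshake, and network round trips, so it is an upper bound on what the wake-up itself contributes.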

Scale to 0 can be used in conjunction with autoscale, to make sure that as you scale back down, if traffic dies completely, your last instance is removed and you're not charged for the service. For more on autoscale, please see the autoscale guide.

Idle Mode

Unikraft Cloud supports an additional scale-to-zero mode called idle mode. This mode allows apps to be scaled down to zero even when they hold long-running, established (but idle) TCP connections. In that situation, UKC will (1) scale the app to zero and (2) keep the TCP connections established until the app wakes back up.

You can enable this mode through the --scale-to-zero=idle flag when deploying your app, and you can find more information about this mode in the API documentation.
