Instances

This document describes the Unikraft Cloud Instances API (v1) for managing Unikraft instances. An instance is a Unikraft virtual machine running a single instance of your app.

Instance states

An instance can be in one of the following states:

| State | Description |
| --- | --- |
| stopped | The instance isn't running and doesn't count against live resource quotas. Connections can't be established. |
| starting | The instance is booting up. This typically takes just a few milliseconds. |
| running | Your app has reached its main entry point. |
| draining | The instance is draining connections before shutting down. No new connections can be established. |
| stopping | The instance is shutting down. |
| standby | The instance has scaled to zero. It isn't running, but Unikraft Cloud starts it automatically when there are incoming requests. |

Unikraft Cloud reports these values as the instance state via the endpoints. In the bit layouts that follow, LSB (least significant bit) refers to the lowest-order bit of a binary number.

Stop reason

To help you understand why an instance stopped or is shutting down, Unikraft Cloud provides a stop reason. You can retrieve it via the GET /v1/instances endpoint when an instance is in the draining, stopping, stopped or standby state.

The stop_reason contains a bitmask that tells you the origin of the shutdown:

| Bit | 4 [F] | 3 [U] | 2 [P] | 1 [A] | 0 [K] (least significant bit, LSB) |
| --- | --- | --- | --- | --- | --- |
| Desc. | This was a force stop [1] | Stop initiated by user [2] | Stop initiated by platform | App exited - exit_code available | Kernel exited - stop_code available |

For example, the stop_reason has the following values in these scenarios:

| Value | Bitmask | Scenario |
| --- | --- | --- |
| 28 | 11100/FUP-- | Forced user-initiated shutdown. |
| 15 | 01111/-UPAK | Regular user-initiated shutdown. The app and kernel have exited. The exit_code and stop_code show if the app and kernel shut down cleanly. |
| 13 | 01101/-UP-K | The user initiated a shutdown but the app was forcefully killed by the kernel during shutdown. This can be the case if the image doesn't support a clean app exit or the app crashed after receiving a termination signal. Unikraft Cloud ignores the exit_code in this scenario. |
| 7 | 00111/--PAK | Unikraft Cloud initiated the shutdown (for example, due to scale-to-zero). The app and kernel have exited. The exit_code and stop_code show if the app and kernel shut down cleanly. |
| 3 | 00011/---AK | The app exited. The exit_code and stop_code show if the app and kernel shut down cleanly. |
| 1 | 00001/----K | The instance likely experienced a fatal crash and the stop_code contains more information about the cause of the crash. |
| 0 | 00000/----- | The stop reason is unknown. |
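As an illustration of the bitmask above, the following minimal Python sketch decodes a stop_reason value into named flags and renders it in the FUPAK notation used in the table. The class and function names (StopReason, format_stop_reason) are hypothetical helpers chosen for this example and are not part of the Unikraft Cloud API; only the bit positions come from the table.

```python
from enum import IntFlag


class StopReason(IntFlag):
    """Hypothetical helper: one flag per stop_reason bit from the table."""
    KERNEL = 1 << 0    # [K] kernel exited, stop_code available
    APP = 1 << 1       # [A] app exited, exit_code available
    PLATFORM = 1 << 2  # [P] stop initiated by platform
    USER = 1 << 3      # [U] stop initiated by user
    FORCE = 1 << 4     # [F] this was a force stop


def format_stop_reason(value: int) -> str:
    """Render a stop_reason value as a five-character FUPAK string."""
    ordered = [
        ("F", StopReason.FORCE),
        ("U", StopReason.USER),
        ("P", StopReason.PLATFORM),
        ("A", StopReason.APP),
        ("K", StopReason.KERNEL),
    ]
    # Emit the letter when its bit is set, otherwise a dash.
    return "".join(letter if value & flag else "-" for letter, flag in ordered)


print(format_stop_reason(13))  # -UP-K
```

A caller could also test single bits directly, for example `StopReason.APP in StopReason(value)`, to decide whether an exit_code is worth inspecting.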

There can be a short delay of a few milliseconds between the instance reaching the stopped state and Unikraft Cloud updating the stop_reason (or vice versa).

Exit code

The app exit code is the value the app returns upon leaving its main entry point. The encoding of the exit_code is app-specific; see the documentation of the app for details. An exit_code of 0 indicates success.

Stop code

The kernel defines the stop_code, which has the following encoding irrespective of the app.

| Bits | 31 - 24 (8 bits) | 23 - 16 (8 bits) | 15 [T] | 14 - 8 (7 bits) | 7 - 0 (8 bits) |
| --- | --- | --- | --- | --- | --- |
| Desc. | Reserved [3] | errno | shutdown_bit | initlvl | reason |

Reason

The reason can be any of the following values:

| Value | Symbol | Scenario |
| --- | --- | --- |
| 0 | OK | Successful shutdown. |
| 1 | EXP | The system detected an invalid state and actively stopped execution to prevent data corruption. |
| 2 | MATH | An arithmetic CPU error (for example, division by zero). |
| 3 | INVLOP | Invalid CPU instruction or instruction error (for example, wrong operand alignment). |
| 4 | PGFAULT | Page fault - see errno for further details. |
| 5 | SEGFAULT | Segmentation fault. |
| 6 | HWERR | Hardware error. |
| 7 | SECERR | Security violation (for example, violation of memory access protections). |

A reason of 0 indicates a clean shutdown. To check for a crash, test only the reason field and ignore the other bits of stop_code.

Init level

initlvl indicates the initialization or shutdown phase at stop time. A level of 127 means the instance was executing the app.

Shutdown bit

shutdown_bit indicates the system was shutting down.

Error number

errno is a Linux error code number that provides more detail about the root cause.

For example, an out-of-memory (OOM) situation triggers a page fault, PGFAULT (4), with errno set to ENOMEM (12). In that case, the stop_code is 0x000C7F04 (818948 in decimal) and the stop_reason is ----K (1), provided the stop occurred during app execution (initlvl 127).
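To make the stop_code layout concrete, here is a minimal Python sketch that splits a stop_code into its fields using the bit ranges from the encoding table and checks it against the OOM example. The names StopCode, REASONS and decode_stop_code are hypothetical and exist only for this illustration; the reserved bits 31 - 24 are ignored, as footnote 3 advises.

```python
from typing import NamedTuple

# Reason symbols as listed in the reason table.
REASONS = {
    0: "OK", 1: "EXP", 2: "MATH", 3: "INVLOP",
    4: "PGFAULT", 5: "SEGFAULT", 6: "HWERR", 7: "SECERR",
}


class StopCode(NamedTuple):
    """Hypothetical container for the decoded stop_code fields."""
    errno: int          # bits 23 - 16
    shutdown_bit: bool  # bit 15 [T]
    initlvl: int        # bits 14 - 8
    reason: int         # bits 7 - 0

    @property
    def reason_symbol(self) -> str:
        # Values outside the documented table are reported as UNKNOWN.
        return REASONS.get(self.reason, "UNKNOWN")


def decode_stop_code(code: int) -> StopCode:
    """Split a 32-bit stop_code into its fields; reserved bits are dropped."""
    return StopCode(
        errno=(code >> 16) & 0xFF,
        shutdown_bit=bool((code >> 15) & 0x1),
        initlvl=(code >> 8) & 0x7F,
        reason=code & 0xFF,
    )


oom = decode_stop_code(0x000C7F04)
print(oom, oom.reason_symbol)
# StopCode(errno=12, shutdown_bit=False, initlvl=127, reason=4) PGFAULT
```

A crash check under this sketch reduces to `decode_stop_code(code).reason != 0`.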

Restart policy

When an instance stops because the app exits or crashes, Unikraft Cloud can restart it automatically according to the restart policy. The policy can have the following values:

| Policy | Description |
| --- | --- |
| never | Never restart the instance (default). |
| always | Always restart the instance when the stop originates from within the instance (that is, the app exits or the instance crashes). |
| on-failure | Only restart the instance if it crashes. |
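The policy semantics can be sketched as a small decision function. This is only an illustration under stated assumptions, not the platform's actual logic: it assumes that a stop "from within the instance" means the A or K bit is set while none of the F, U or P bits are, and that a "crash" means the kernel reported a stop_code whose reason field is non-zero. The function name should_restart is hypothetical, and the sketch deliberately does not consider exit_code.

```python
# Bit values of stop_reason, as documented in the stop reason section.
K, A, P, U, F = 1, 2, 4, 8, 16


def should_restart(policy: str, stop_reason: int, stop_code: int) -> bool:
    """Hypothetical sketch of the restart decision for one stopped instance."""
    # Assumption: the stop came from inside the instance when the app or
    # kernel exited and neither user, platform, nor force stop was involved.
    from_inside = bool(stop_reason & (A | K)) and not stop_reason & (F | U | P)
    # Assumption: a crash is a kernel exit with a non-zero reason field
    # (low 8 bits of stop_code).
    crashed = bool(stop_reason & K) and (stop_code & 0xFF) != 0

    if policy == "always":
        return from_inside
    if policy == "on-failure":
        return from_inside and crashed
    return False  # "never", the default policy


# App exited cleanly: restarted under "always", not under "on-failure".
print(should_restart("always", 3, 0), should_restart("on-failure", 3, 0))
# True False
```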

When an instance stops, Unikraft Cloud evaluates the stop reason and the restart policy to decide whether to restart. It uses an exponential back-off delay (immediate, 5s, 10s, 20s, 40s, 5m) to slow down restarts in tight crash loops. If an instance runs without problems for 10s, Unikraft Cloud resets the back-off delay and the restart sequence ends.

The restart.attempt value in GET /v1/instances counts restarts in the current sequence. The restart.next_at field indicates when the next restart occurs if a back-off delay is in effect.

A manual start or stop of the instance aborts the restart sequence and resets the back-off delay.
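The back-off behavior described in this section can be modeled with a short Python sketch. It rests on two assumptions that the text does not confirm: that the attempt counter indexes the delay sequence starting at 0, and that the delay stays at 5 minutes once the sequence is exhausted. The names BACKOFF_SECONDS, backoff_delay and next_attempt are hypothetical.

```python
# Delay sequence from the text: immediate, 5s, 10s, 20s, 40s, 5m.
BACKOFF_SECONDS = [0, 5, 10, 20, 40, 300]
# An instance that runs this long without problems resets the back-off.
RESET_AFTER_SECONDS = 10


def backoff_delay(attempt: int) -> int:
    """Delay in seconds before restart number `attempt` (0-based, assumed)."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Assumption: once the sequence is exhausted, the last delay repeats.
    return BACKOFF_SECONDS[min(attempt, len(BACKOFF_SECONDS) - 1)]


def next_attempt(attempt: int, uptime_seconds: float) -> int:
    """Advance the counter, or reset it after a sufficiently long healthy run."""
    if uptime_seconds >= RESET_AFTER_SECONDS:
        return 0
    return attempt + 1


print([backoff_delay(n) for n in range(7)])
# [0, 5, 10, 20, 40, 300, 300]
```

In this model, a manual start or stop would correspond to setting the counter back to 0.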

Footnotes

  1. A forced stop doesn't give the instance a chance to perform a clean shutdown. Bits 0 [K] and 1 [A] can thus never occur for forced shutdowns. As a result, there won't be an exit_code or stop_code.

  2. A stop command originating from the user travels through the platform controller. This is why bit 2 [P] will also always occur for user-initiated stops.

  3. The system sets reserved bits to 0. Ignore them.
