Troubleshooting

Although we would love everything to work out of the box, the world, the cloud, and software are more complicated than that. This guide shows you how to debug apps on Unikraft Cloud (UKC) so that you can fix issues yourself or, in the worst case, give us enough information to fix them for you.

I have a problem with my app, how do I debug it?

Your application may crash, freeze or misbehave. To inspect it, try either:

Terminal
kraft cloud inst logs <instance_name> # or kraft cloud inst get <instance_name>

This may, at times, provide insufficient information.

Why does my (large) filesystem build process get stuck?

For filesystems larger than about 800 MB, the build may get stuck. The issue is caused by a problem in BuildKit, the filesystem build component currently used by Unikraft Cloud.

The current workaround is to reduce the filesystem image below 800 MB; we are working to integrate another filesystem build component that gets past the BuildKit limitation.
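As a quick pre-flight check, the sketch below reports whether the contents of a directory exceed the roughly 800 MB threshold mentioned above. It assumes the files destined for the root filesystem are available in a local directory; the function name `rootfs_size_status` and the `LIMIT_KB` variable are illustrative, and the on-disk size only approximates the size of the final image.

```shell
# Hypothetical helper: prints "ok" or "too-large" for a directory.
# LIMIT_KB encodes the approximate 800 MB limit described above.
LIMIT_KB=$((800 * 1024))

rootfs_size_status() {
  # du -sk prints the total size in kilobytes followed by the path.
  size_kb=$(du -sk "$1" | awk '{print $1}')
  if [ "$size_kb" -gt "$LIMIT_KB" ]; then
    echo "too-large"
  else
    echo "ok"
  fi
}
```

Calling `rootfs_size_status ./rootfs` (with `./rootfs` standing in for your own directory) before a deploy gives an early hint that the image may need trimming.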

I'm getting a gateway error, what gives?

When you query a Unikraft Cloud service via its public URL, such as below:

Terminal
curl https://frosty-bobo-zeev783o.fra0.kraft.host

you may get a Bad Gateway response:

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>openresty</center>
</body>
</html>

This happens when you specify the wrong internal application port when deploying. For example, the application listens on port 8080 but you deploy with 443:80 instead of 443:8080.

Another error response you may get is:

<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <title>Service not found</title>
</head>
<body>
  There is no service on this URL.
</body>
</html>

This typically happens when you connect to your app with an HTTPS client (on port 443) and the application doesn't expose that port externally. This is the case with database services such as MongoDB or MariaDB that use a different protocol/port (e.g., 27017, 3306).

To solve this, use the appropriate exposed port; this may require a TLS tunnel, as detailed below.
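To tell the two error pages apart in a script, a minimal sketch along the following lines can help. It reads a response body on standard input and maps it to one of the two causes described above. The function name `classify_gateway_error` and its output labels are made up for illustration; the matching relies only on the page titles shown above.

```shell
# Hypothetical helper: classify a response body read from stdin.
classify_gateway_error() {
  body=$(cat)
  case "$body" in
    # 502 page: the internal port in the port mapping is likely wrong.
    *"502 Bad Gateway"*) echo "wrong-internal-port" ;;
    # "Service not found" page: the port is not exposed over HTTPS.
    *"Service not found"*) echo "port-not-exposed" ;;
    *) echo "unknown" ;;
  esac
}
```

For example, pipe the output of `curl -s` on your service's public URL into `classify_gateway_error`.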

How do I connect to a non-TLS App?

Unikraft Cloud uses TLS to expose services to the outside world. However, certain applications (such as MongoDB or MariaDB) do not use TLS. In these cases, you can create a TLS tunnel with the kraft cloud tunnel command, which opens a local endpoint (on localhost / 127.0.0.1) and forwards client traffic through the tunnel. For an example, please refer to the MariaDB guide.

I'm getting a "no such file or directory" when building/deploying an image

When building / deploying an image, you may get the error below:

Terminal
E unpacking the image: opening layer: open <path_to_home_directory>/.local/share/kraftkit/runtime/oci/digests/sha256/3d30a7ba2a819aec2e73c5df07c24264d66891e926b395b9ef0f66f151db4b49: no such file or directory

Generally, this means that the local kraft cache is in an inconsistent state.

To solve this, remove the local kraft cache and local packages:

Terminal
kraft pkg rm --all
rm -fr ~/.local/share/kraftkit

I've launched an app but I can't see it/list it

The most common reason (it happens to us too!) is deploying an app to one metro but listing instances in a different one. You can set the metro for a whole session through an environment variable:

Terminal
export UKC_METRO=<metro_name>

or for each individual command via the --metro flag, for example:

Terminal
kraft cloud instance list --metro <metro_name>
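If you are not sure which metro an app landed in, a small loop over the metros you use can save some guessing. The sketch below is only an illustration: the function name `list_in_metros` is hypothetical, and the metro names are whatever you pass in (fra0 is the one that appears in the examples on this page).

```shell
# Hypothetical helper: run the instance listing once per metro name given.
list_in_metros() {
  for metro in "$@"; do
    echo "== $metro =="
    kraft cloud instance list --metro "$metro"
  done
}
```

Call it as `list_in_metros fra0`, adding any other metro names you deploy to as further arguments.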

How can I cache the app's filesystem for faster builds?

When using a Dockerfile for the application filesystem, the commands are passed to BuildKit.

By default, an ephemeral BuildKit container is started each time you invoke kraft cloud deploy. Its filesystem build data is removed afterwards, so each deploy command starts from zero.

To prevent this, start a persistent BuildKit container and point kraft at it:

Terminal
docker run -d --name buildkitd --privileged moby/buildkit:latest
export KRAFTKIT_BUILDKIT_HOST=docker-container://buildkitd

The approach above will cache the builds in the BuildKit container filesystem. Another approach is to save the cache in a local host directory, for persistence:

Terminal
docker run -d --name buildkitd --privileged -v $HOME/.buildkit-cache:/var/lib/buildkit moby/buildkit:latest
export KRAFTKIT_BUILDKIT_HOST=docker-container://buildkitd

where $HOME/.buildkit-cache is a local path on your machine where you can store caches from BuildKit itself.
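Running docker run a second time with the same container name fails because the name is already taken, so a shell profile or CI script may want an idempotent variant. The following is a sketch under that assumption: the function name `ensure_buildkitd` is hypothetical, while the docker run arguments are the ones shown above.

```shell
# Hypothetical helper: reuse the buildkitd container if it exists,
# otherwise create it with a host directory mounted for the cache.
ensure_buildkitd() {
  cache_dir="${1:-$HOME/.buildkit-cache}"
  # List all containers (running or stopped) and look for an exact name match.
  if docker ps -a --filter "name=buildkitd" --format '{{.Names}}' | grep -qx "buildkitd"; then
    # Starting an already running container is harmless.
    docker start buildkitd
  else
    docker run -d --name buildkitd --privileged \
      -v "$cache_dir:/var/lib/buildkit" moby/buildkit:latest
  fi
}

# Point kraft at the persistent container, as described above.
export KRAFTKIT_BUILDKIT_HOST=docker-container://buildkitd
```

Call `ensure_buildkitd` once before running kraft cloud deploy; pass a directory as the first argument to store the cache somewhere other than $HOME/.buildkit-cache.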

What's a Kraftfile?

A Kraftfile is used by the kraft CLI tool to understand how to build and deploy your instance. In most cases, you can use the default Kraftfile found in each of our examples; but in case you're curious, here's a sample Kraftfile and a brief explanation of it:

YAMLKraftfile
spec: v0.6
runtime: python:3.12
rootfs: ./Dockerfile
cmd: ["/usr/bin/python3", "/src/server.py"]
labels:
  cloud.unikraft.v1.instances/scale_to_zero.policy: "on"
  cloud.unikraft.v1.instances/scale_to_zero.stateful: "true"
  cloud.unikraft.v1.instances/scale_to_zero.cooldown_time_ms: 1000

The runtime specifies one of the UKC runtimes (unikernels) built to run different languages and applications, in this case a Python one.

The rootfs parameter tells kraft to use a Dockerfile in the same directory to build the root filesystem, and the cmd parameter specifies which command to run when deployed (although you can also specify this in the Dockerfile).

Finally the labels are there to specify run-time options; in the example above, we're enabling stateful scale to zero, and asking UKC to wait for 1 second of no requests before putting our instance to sleep.

Debugging the Build, Packaging and Pushing Steps

The kraft cloud deploy command performs several steps on your local device before actually deploying your app to UKC:

  1. Downloads the runtime image (defined in the Kraftfile) from the UKC registry.
  2. Builds the application filesystem using the Dockerfile via BuildKit.
  3. Packages the application filesystem and the runtime in an OCI image.
  4. Pushes the OCI image to the UKC registry under your username's namespace.

If any of these steps fail or report issues, you can enable kraft debugging with the --log-level and --log-type flags:

Terminal
kraft cloud deploy --log-level debug --log-type basic [...]

You should then see debug output for the 4 steps above, similar to:

Terminal
D using metro=fra0
D using token=***
D cannot run because: no image specified deployer=image-name
D using deployer=kraftfile-runtime
D querying catalog local=true name=index.unikraft.io/razvan.unikraft.io/http-go-strace2:latest remote=false
D found 0/66 matching packages in oci catalog
D using packager=kraftfile-runtime
D querying catalog arch=x86_64 local=true name=index.unikraft.io/official/base plat=kraftcloud remote=false version=latest
D found 0/66 matching packages in oci catalog
D querying catalog arch=x86_64 local=true name=index.unikraft.io/official/base plat=kraftcloud remote=true version=latest
i found index.unikraft.io/official/base:latest-dbg (kraftcloud/x86_64) (6bd4542, 11c196e, 2.0 MB)
D found 1/67 matching packages in oci catalog
D saving manifest digest=sha256:6bd45428b8ecf667954cb8f829ee3f8f6f6f23fa0a0e7dc1aa27fcae39826f89 ref=index.unikraft.io/official/base:latest
i building rootfs
D using buildkit addr=docker-container://buildkitd version=v0.12.3

Hopefully this output will give you better insights into what the issue is; if not, please report the issue to us with the output on our Discord server.
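When the debug output gets long, it can help to save it to a file and pull out individual facts. As one example, the sketch below extracts the BuildKit address from the "using buildkit" line, which is a quick way to confirm that a persistent BuildKit container is actually in use. The function name `buildkit_addr` and the log file name are hypothetical, and the pattern assumes the line format shown in the sample output above.

```shell
# Hypothetical helper: print the BuildKit address found in kraft debug output
# read from stdin (matches lines like "D using buildkit addr=... version=...").
buildkit_addr() {
  sed -n 's/.*using buildkit addr=\([^ ]*\).*/\1/p'
}

# Possible usage (deploy-debug.log is an arbitrary file name):
#   kraft cloud deploy --log-level debug --log-type basic . 2>&1 \
#     | tee deploy-debug.log | buildkit_addr
```

The saved file is also convenient to attach when reporting the issue.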

Debugging Running Apps

The most direct way to debug apps on UKC is to use the app's console output, which may include kernel output. To see it, after starting the UKC instance (via kraft cloud deploy or kraft cloud inst create), use:

Terminal
kraft cloud inst logs <instance_name>

In case of a crash, you'll see a full crash output:

Powered by Unikraft Telesto (0.16.2~9c264902)
[ 0.066949] CRIT: [libukvmem] Cannot handle write page fault at 0x1000bb8024 (ec: 0x2): -12
[ 0.067680] CRIT: [libkvmplat] page fault handler returned error: -12
[ 0.068227] CRIT: [libkvmplat] Unhandled page fault vaddr=0x1000bb8024, error code=0x2
[ 0.068910] CRIT: [appelfloader] Unikraft crash - Telesto (0.16.2~9c264902)
[ 0.069503] CRIT: [appelfloader] Thread "python3"@0x4017e4020
[ 0.069993] CRIT: [appelfloader] RIP: 0008:000000100041fdcb
[ 0.070468] CRIT: [appelfloader] RSP: 0010:000000100017f760 EFLAGS: 00010206 ORIG_RAX: 0000000000000002
[ 0.071269] CRIT: [appelfloader] RAX: 0000001000bb8000 RBX: 00000004018012f0 RCX:0000000000000007
[ 0.072040] CRIT: [appelfloader] RDX: 0000000000000000 RSI: aaaaaaaaaaaaaaab RDI:00000010007bccc0
[ 0.072793] CRIT: [appelfloader] RBP: 000000000000001a R08: 0000000000000001 R09:00000010004e2b99
[ 0.073558] CRIT: [appelfloader] R10: bb9e744ce6503a27 R11: 0000000401861290 R12:00000010007bccc0
[ 0.074312] CRIT: [appelfloader] R13: 00000010007bbd48 R14: 000000100017f890 R15:0000000000000038
[ 0.075076] CRIT: [appelfloader] Stack:
[ 0.075406] CRIT: [appelfloader] 100017f760 00 00 00 00 00 00 00 00 |........|
[ 0.076043] CRIT: [appelfloader] 100017f768 a1 01 00 00 00 00 00 00 |........|
[ 0.076680] CRIT: [appelfloader] 100017f770 48 bd 7b 00 10 00 00 00 |H.{.....|
[ 0.077324] CRIT: [appelfloader] 100017f778 e0 92 7a 00 10 00 00 00 |..z.....|
[ 0.077959] CRIT: [appelfloader] 100017f780 48 bd 7b 00 10 00 00 00 |H.{.....|
[ 0.078596] CRIT: [appelfloader] 100017f788 90 f8 17 00 10 00 00 00 |........|
[ 0.079233] CRIT: [appelfloader] 100017f790 38 00 00 00 00 00 00 00 |8.......|
[ 0.079870] CRIT: [appelfloader] 100017f798 72 fc 41 00 10 00 00 00 |r.A.....|
[ 0.080504] CRIT: [appelfloader] Call Trace:
[ 0.080871] CRIT: [appelfloader] [0x000000100041fdcb]
[ 0.081311] CRIT: [appelfloader] Bad frame pointer

It looks like the instance exited fatally.
To see more details about why, run:
kraft cloud instance get http-python312-hb7ij

Use the recommended command for detailed output. You get something like:

Terminal
uuid: 988c3054-09c0-48ef-ac02-877f8352d93f
name: http-python312-hb7ij
fqdn:
private fqdn: http-python312-hb7ij.internal
private ip: 172.16.3.1
state: stopped
created: 2024-04-18T18:24:12Z
started: 2024-04-18T18:24:12Z
stopped: 2024-04-18T18:24:13Z
start count: 1
restart count: 0
restart attempts:
next restart:
restart policy: never
stop origin: initiated by kernel (----k)
stop reason: Out of memory. Try increasing instance's memory (see -M flag). (i127 PGFAULT ENOMEM)
app exit code:
image: http-python312@sha256:8b92ed612f8450de94355a0a5b8917a710fd6c902c4a7be80ffb3989b5a99023
memory: 128 MiB
args: /usr/bin/python3 /src/server.py
env:
volumes:
service:
boot time: 96.03 ms
up time: 166ms

The stop reason line gives you a clearer explanation of why the error occurred. In this case, the instance ran out of memory.

Sometimes the stop reason is insufficient for debugging. The next step is to enable debug tracing for the instance.
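To pick out just that line in a script, a filter along these lines may be enough. It is a sketch that assumes the output lists one "key: value" field per line, possibly indented; the function name `stop_reason` is hypothetical.

```shell
# Hypothetical helper: print only the value of the "stop reason" field
# from instance details read on stdin.
stop_reason() {
  sed -n 's/^[[:space:]]*stop reason:[[:space:]]*//p'
}

# Possible usage:
#   kraft cloud inst get <instance_name> | stop_reason
```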

To do that, in your application of choice in the examples repository, or in an application directory you created, update the runtime entry in the Kraftfile to reference the debug build of the image you use by adding -dbg to the name of the runtime. For example, if you want to run the http-go1.21 example with debug output, update its Kraftfile as follows:

YAMLKraftfile
spec: v0.6
runtime: base:latest-dbg
rootfs: ./Dockerfile
cmd: ["/server"]

That is, simply change base:latest to base:latest-dbg. Now we're ready to re-deploy:

Terminal
kraft cloud deploy --name http-go-strace -p 443:8080 .

We can now inspect the logs as before and get the system call tracing:

Terminal
kraft cloud inst logs http-go-strace
Terminal
[ 0.000000] Info: [libkvmplat] Unikraft Telesto (0.16.2~5b96d531)
[ 0.000000] Info: [libkvmplat] Architecture: x86_64
[ 0.000000] Info: [libkvmplat] Boot loader : unknown-lxboot
[ 0.000000] Info: [libkvmplat] Command line: unikraft netdev.ip="172.16.3.4/24:172.16.3.254:172.16.3.254::http-go-strace:internal" vfs.fstab="initrd0:/:extract::ramfs=2:" virtio_mmio.device=4K@0xd0001000:5 -- /server
[...]
epoll_ctl(0x4, 0x1, ...) = 0x0
epoll_ctl(0x4, 0x1, ...) = 0x0
getsockname(fd:3, <out>sockaddr:{sa_family=AF_0xffffffff901f0002, ...}, <ref:0xc000045a9c>10385019274329587824) = OK
accept4(0x3, 0xc000045ad0, ...) = Resource temporarily unavailable (-11)
nanosleep(0x100052f120, 0x0, ...) = 0x0
clock_gettime(CLOCK_MONOTONIC, <out>timespec:{tv_sec=0, tv_nsec=187025020}) = OK
clock_gettime(CLOCK_MONOTONIC, <out>timespec:{tv_sec=0, tv_nsec=187640540}) = OK
set_robust_list(0x10005839a0, 0x18, ...) = Function not implemented (-38)
rt_sigprocmask(0x2, 0x1000583fb0, ...) = 0x0
mmap(va:0x1050000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000, fd:-1, 0) = va:0x1050000000
mprotect(va:0x1050000000, 135168, PROT_READ|PROT_WRITE) = OK
sigaltstack(0x0, 0x1000583158, ...) = 0x0
rt_sigprocmask(0x2, 0x1000583168, ...) = 0x0
gettid() = pid:6
epoll_pwait(0x4, 0x10001803c8, ...) = 0x0
clock_gettime(CLOCK_MONOTONIC, <out>timespec:{tv_sec=0, tv_nsec=191570733}) = OK
epoll_pwait(0x4, 0x100052eb28, ...) = 0x0
nanosleep(0x100052f120, 0x0, ...) = 0x0
clock_gettime(CLOCK_MONOTONIC, <out>timespec:{tv_sec=0, tv_nsec=192861008}) = OK
clock_gettime(CLOCK_MONOTONIC, <out>timespec:{tv_sec=0, tv_nsec=193478343}) = OK

This mechanism should work for all apps/runtimes and, we hope, will give you enough insight into what the problem is. If not, please contact us on the Discord server and provide this output so we can assist you in debugging.

While you work out what the issue with an app is, you can mitigate crashes by setting a restart policy. For example, you could use kraft cloud deploy --restart on-failure to have UKC automatically bring your app back up if it crashes. See the restart policies documentation for more information.
