Back to Docker After Issues With Podman


tl;dr - Turns out the tar archives created by podman didn’t work properly for my kubernetes cluster, and podman doesn’t support linking. I went back to docker (which is containerd underneath), and am a happy camper again. I also discuss some options for hosting your own registries.

After writing about switching to podman (and writing a follow-up), I recently had to switch back to docker and enable rootless containers there (the ever-useful Arch Wiki makes it easy). I ran into issues with podman that don’t seem like they’ll be solved any time soon, and they were showstoppers for me. Other than this I quite enjoyed using podman, so much so that I used $(DOCKER) all over my Makefiles and swapped in podman on almost every project.

First issue: pushed images with incorrect tar headers

The first issue that showed up was when I was trying to run a container image on a Kubernetes cluster I have – the tar header seemed to be invalid/unknown:

  Warning  Failed     7m19s (x4 over 7m56s)   kubelet, ubuntu-1810-cosmic-64-minimal  (combined from similar events): Error: failed to create containerd container: error unpacking image: failed to extract layer sha256:c870fc5a76c14dd81013c59f71212389dce2846076de40aff68067376763382c: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount405481515: archive/tar: invalid tar header: unknown
  Normal   Pulled     4m29s (x23 over 9m27s)  kubelet, ubuntu-1810-cosmic-64-minimal  Container image "registry.gitlab.com/project/img:1.1.0" already present on machine

(project name and image name have been replaced with project and img respectively)
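
To make the error concrete, here’s a quick local sketch of what tar does when it hits bytes that aren’t a valid header (the path is illustrative). Tar archives are made of 512-byte header blocks with an internal checksum, and tar bails out the same way containerd did above when a block doesn’t parse:

```shell
# Write something that is definitely not a valid tar header block
printf 'definitely not a tar header' > /tmp/bogus.tar

# Listing the "archive" fails, roughly mirroring the
# "archive/tar: invalid tar header" error from containerd
if ! tar -tf /tmp/bogus.tar 2>/dev/null; then
  echo "tar rejected the archive"
fi
```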

This seemed to be a general error that lots of existing issues mention.

Trying turning it off and on again

On the k8s side I made sure the imagePullPolicy was set to Always to ensure I would always get the newest version. Unfortunately, none of that helped.
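
For reference, here’s a minimal sketch of that setting (the pod name is a placeholder, and the image name matches the redacted one used throughout this post):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: img-test   # placeholder name
spec:
  containers:
    - name: img
      image: registry.gitlab.com/project/img/img:1.1.0
      # Re-pull the image on every pod start instead of trusting the cached copy
      imagePullPolicy: Always
```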

I restarted the local containerd systemd unit, that didn’t help.

I thought maybe the issue might have been with GitLab’s container registry, so I tried re-uploading and seeing if anything changed, and of course checked the UI to ensure that the image was there and the size I expected with the SHA hash I expected. I also cleaned out and re-pushed images to GitLab and that didn’t help either.

I pulled the image myself via a simple docker pull:

$ docker pull registry.gitlab.com/project/img/img:1.1.0

After pulling, running the image locally (using podman) worked just as I expected.

Trying to inspect the built image with dive

CORRECTION: dive seems to have merged initial support for podman back in 2019; I think I just didn’t give it the flags it needed (likely --source podman) to know it should use podman. So despite the text that directly follows this correction, dive may well work with podman.

dive is a really cool tool that lets you explore images, and since the theory was that the tar header was malformed, I thought it might be a good way to check. It would have been, except that dive requires docker:

$ dive registry.gitlab.com/project/img/img:1.1.0
Image Source: docker://registry.gitlab.com/project/img/img:1.1.0
Fetching image... (this can take a while for large images)
Handler not available locally. Trying to pull 'registry.gitlab.com/project/img/img:1.1.0'...
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
cannot fetch image
exit status 1

So it looks like that won’t work for me, I’ll have to try other things (or switch to docker, which to be fair I ended up doing anyway).

Mucking around on the server side

Since the problem is happening on the server side, after all, I figured I should try to pull the image there directly. Since I’m using containerd (without Docker) for my k8s cluster, I needed to use ctr:

$ ctr image pull registry.gitlab.com/project/img/img:1.1.0
registry.gitlab.com/project/img/img:1.1.0: resolving      |--------------------------------------|
elapsed: 0.6 s                                total:   0.0 B (0.0 B/s)
ctr: failed to resolve reference "registry.gitlab.com/project/img/img:1.1.0": failed to fetch anonymous token: unexpected status: 403 Forbidden

Whoops, forgot the credentials!

$ ctr image pull -u <user>:<password> registry.gitlab.com/project/img/img:1.1.0
registry.gitlab.com/project/img/img:1.1.0:                                     resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:d9bb6d483f7954d3b8138058893e80da942788233e17ec6c3145441b7e174842: done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:73a3287168d7211d2eb559e45dacc121d80212b3337dbee29c80b9d6d5ecafef:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:70bd3e51163fa20380a3802272093093d18e33873528140cf474e37da120597a:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4779f680d1eeddd5d72ea9db3978bf2c23cd302870828f78bd16df6f3c1198c0:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:b536a036582a47a6728458517ae599a653cc302389cfff479d13512c678d26af:    done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:1999599f460cdc61881b56f843e775858faefcd471c3ce9ec7811722710d31bd:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:ed5f79e7b01725f59a75e8ed125f11d26ce6663f6b5c1b51049f4158cf703b83:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:37d65574c739c78dd8587086e3310a212180b2bab7881fc4db467559fc9d4927:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3eee30c545e47333e6fe551863f6f29c3dcd850187ae3f37c606adb991444886:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:b3bcfd028ff28e5ad809357978e55a0f5422c552d124e5ff4da61eb208c41ac0:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:8c37e6e6d6f3c3357228ac80191ffc45d49d8591da262cd3f4e9a67c5f23228d:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:19278a490e16892043269769e4955c4b8f4760337213b941c78a67e9a6b9c4f7:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:be5dcf9e205eba9cac18a8cd37d6b640072e9028aa858c83904fe6f42b0fc847:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:74b39f7108e315375ffd7a92c10c28dcef4224d6a604c09e4c434c8b19daf613:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:aba1ae48b3bb6da629b5da7fe8744cd270cb3a155dd41d7176d8de17c8840d49:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4ca545ee6d5db5c1170386eeb39b2ffe3bd46e5d4a73a9acbebc805f19607eb3:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 2.6 s                                                                    total:   0.0 B (0.0 B/s)
unpacking linux/amd64 sha256:d9bb6d483f7954d3b8138058893e80da942788233e17ec6c3145441b7e174842...
INFO[0002] apply failure, attempting cleanup             error="failed to extract layer sha256:3e207b409db364b595ba862cdc12be96dcdad8e36c59a03b7b3b61c946a5741a: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount015216570: archive/tar: invalid tar header: unknown" key="extract-12177479-7WCM sha256:3e207b409db364b595ba862cdc12be96dcdad8e36c59a03b7b3b61c946a5741a"
ctr: failed to extract layer sha256:3e207b409db364b595ba862cdc12be96dcdad8e36c59a03b7b3b61c946a5741a: mount callback failed on /var/lib/containerd/tmpmounts/containerd-mount015216570: archive/tar: invalid tar header: unknown

Well, there’s my confirmation of the issue, from a different image-pulling tool run directly: failing to extract layers. The pull itself works, but clearly what it’s downloading isn’t valid. I did a double check to ensure that what I pulled shows up in ctr image ls:

$ ctr image ls
REF                                          TYPE                                       DIGEST                                                                  SIZE    PLATFORMS   LABELS
registry.gitlab.com/project/img/img:1.1.0 application/vnd.oci.image.manifest.v1+json sha256:d9bb6d483f7954d3b8138058893e80da942788233e17ec6c3145441b7e174842 1.3 GiB linux/amd64 -

OK so far so good, but what’s with this tar header issue? I did some more digging and found a few issues that were illuminating.

It looks like this is a well-known issue with podman (and buildah, which is vendored inside it)… This was when I decided to go back to docker.

Switching back to docker, but with rootless mode enabled

At this point I was all set to go back to docker, but I didn’t want to lose the rootless mode setup I had going. Thanks to the ever-excellent Arch Wiki and yay (which is now deprecated in favor of paru), I was able to get started with the Docker rootless directions.

Looks like I’ll be sticking with this setup (socket activation + user-systemd docker) for a while if it keeps working…

So was the problem fixed?

Yep, switching back to docker fixed the issue:

$ ctr image pull -u <user>:<password> registry.gitlab.com/project/img/img:1.1.0
registry.gitlab.com/project/img/img:1.1.0:                                     resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:8eb2bb0cb14f7ae80786ea1e00d9d8f413b31610c62c51e511d8f7cddd4d4218: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:76c693c1e554a0ef90fa6d2d3567a364b70c86acfc6450d31f56b43f439d10fd:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:d1e3b368a9fb613187fe68d37b3e029eb0abece3df09b95dc50db940d89a06b3:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3d8fa42a0b9c40c3090bd63f6932b2507a57c4bcd7ed634e6f964f5db44c08cb:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:cbdbe7a5bc2a134ca8ec91be58565ec07d037386d1f1d8385412d224deafca08:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:4272ed33d49a5831893b88fada72351b2c43475752cad891bdcd0b1733d2d10d:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:3963a531e10bce023f1f71c11036eb63eb8e1f838a20d76b794e46294b20f00d:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:92c0928e245702c640290e59c2a261b4119ae9f698e916d8f530ef9acafbe690:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:bfa49af56c61fdbf8f1e0fa47f90a26f8eb9c66e12f28a230259acef684b3655:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:23ea0cdcf8d6cae9071778f373aaea0a4ad8f38215210530221b26dfe10a1dbe:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:f2ffd52523c3c53242323a6259d0775306d5d3293c223b73a25ebe45cc855cfe:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:7a27e63388b243ea1e750aba171f05f71fb9df23ab31dea9d049a4207fec4ae4:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:48a445fb9d78e4fe7356b01031e5eee9ba5a3229a24d4d06c1c0e68271f51921:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:767ab91270c731e043c7052d7dcbdabeb73c79144373b11e03800dcdcd6d9017:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:943cdb7c03e66d64640d1e8199a76133736688e0bdd5bf4c9ddc008f6938a1c5:    done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:caa3ccb6df5dc5880266ebc708bc3b69814899cc8476dc76c20b8b7979320030:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 5.1 s                                                                    total:   0.0 B (0.0 B/s)
unpacking linux/amd64 sha256:8eb2bb0cb14f7ae80786ea1e00d9d8f413b31610c62c51e511d8f7cddd4d4218...
done

Sidebar: Future plans for hosting my own images

With the recent announcements of limitations on DockerHub (and the docker docs on the rate limit), it looks like the era of free, reckless container storage on DockerHub is over (and that’s probably a good thing; thanks to Docker the company for making it possible). I am pretty happy with GitLab’s Container Registry, but GitHub has one too now, and AWS has entered the fray with public registries. I always like angling to make my Kubernetes clusters self-sufficient, so one thing I want to do is run a container registry from inside the cluster itself and push straight there.

I haven’t set this up yet, but figured it might be worth jotting down some of my thoughts and the landscape.

One of the first resources I revisited (I keep a lot of bookmarks) was Alex Ellis’s guide on deploying distribution. It’s a decent guide for setting up distribution but seems a bit over-complicated. What I know I need to do is:

  • Run distribution on one or more k8s nodes
  • Expose distribution to the internet the usual way (an Ingress/IngressRoute)
  • Protect it with HTTPS so your credentials don’t get stolen
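
The first two steps above could be sketched roughly like this (all names and the port are placeholders; the Ingress, TLS, auth, and persistent storage still need wiring up):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: registry            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels: {app: registry}
  template:
    metadata:
      labels: {app: registry}
    spec:
      containers:
        - name: distribution
          image: registry:2          # the distribution project's official image
          ports:
            - containerPort: 5000    # distribution's default listen port
---
apiVersion: v1
kind: Service
metadata:
  name: registry
spec:
  selector: {app: registry}
  ports:
    - port: 5000
      targetPort: 5000
```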

Since this seems almost too simple, there are a few more options out there that I want to take a shot at self-hosting: Kraken, Dragonfly, Harbor, and Quay.

Kraken and Dragonfly seem to be more stripped-down infrastructure pieces, whereas Harbor and Quay are more batteries-included, coming with their own UIs. If I find time to do a bit of comparing and contrasting between these, I’ll make sure to write a post about it!

BONUS: Issues with Gitlab’s CI Runner

Container Linking isn’t a thing (for now) with podman

If you’re like me, you really like the fact that you can bring your own runners to GitLab.com or a private GitLab instance. Unfortunately, one of the issues with using podman locally is that it doesn’t support container linking like docker does.

Linking doesn’t seem to be supported (yet?) by podman, and the error you’ll see looks like the following:

ERROR: Job failed (system failure): prepare environment: Error response from daemon: bad parameter: Link is not supported (docker.go:724:0s). Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

This meant that podman was simply no longer an option for me. The recent changes to GitLab tiers and included amenities mean that the free tier (which I wasn’t on, but I’m a bit hesitant to go up in tiers) only includes 400 CI minutes (compared to 2000 previously), which means I’m going to need to rely on local runners more often, unless I switch to a bigger plan.

Switching back to docker fixed this immediately, as you might expect.

Non-standard Socket location with docker and gitlab-runner

One unfortunate result of the setup the Arch Wiki espouses is that GitLab’s docker-in-docker (dind) service doesn’t quite work out of the box. When you set up rootless docker it listens at /run/user/1000/docker.sock, which is obviously not the same as the usual root-owned /var/run/docker.sock.
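
For local docker CLI usage against the rootless daemon, the usual trick (this mirrors what the rootless setup suggests; the uid 1000 path is just what it resolves to on my machine) is to point DOCKER_HOST at the user socket:

```shell
# Point docker clients at the rootless daemon's socket.
# /run/user/$(id -u) is the systemd user runtime directory (1000 in my case).
export DOCKER_HOST="unix:///run/user/$(id -u)/docker.sock"
echo "$DOCKER_HOST"
```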

Turns out this is a bit of an issue, but it’s easily solved if you just make sure to change /var/run/docker.sock to /run/user/1000/docker.sock where necessary in your GitLab Runner configuration (after it registers and connects). I had lots of other issues getting the docker-powered GitLab runner working locally, but I won’t go into those here.

Unfortunately for this case, root-ful docker was actually the best fit for me (and “just worked” with gitlab-runner), but what I’m going to do is run gitlab-runner in a proper VM, since my machine is pretty beefy.

Here’s what the local config looked like for the runner:

[[runners]]
  name = "e28403069740"
  url = "https://gitlab.com"
  token = "<your token would be here>"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "docker:19.03.12"
    privileged = true # required for dind
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/run/user/1000/docker.sock:/var/run/docker.sock", "/cache"]
    shm_size = 0

Don’t know if this will help anyone out there, but hopefully it does.

If you know you want to override the image used for a given service, you can use the following snippet under [runners.docker]:

    [[runners.docker.services]]
      image = "docker:19.03.13-dind"
      alias = "docker"

Wrap-up

I made this post mostly out of a feeling of responsibility (because of the previous posts pushing podman). Hopefully no one ran into these issues as a result of reading those posts, but it’s important to at least let readers know that there are (currently) some issues with choosing podman as your daily driver.
