tl;dr - Quick guide to getting rootless containers up and running on Arch Linux (also see the excellent Arch Wiki entry)
As usual, the Arch Wiki is a fantastic resource and has basically everything you need, even if it is a little bit spread out. The relevant pages you’ll want to look at:
kernel.unprivileged_userns_clone=1 setting
This is something I think I did a long time ago when first trying to get podman working on my system (with a previous version). You can set it with sysctl:
$ sudo sysctl kernel.unprivileged_userns_clone=1
You can also make this permanent by adding it to /etc/sysctl.d/userns.conf (in a root shell):
# echo 'kernel.unprivileged_userns_clone=1' > /etc/sysctl.d/userns.conf
For me this value was already set to 1, so I assume I must have done this earlier (though I didn’t have a /etc/sysctl.d/userns.conf file).
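If you want to check the current value before changing anything, a small helper like the one below can read it. This is only a sketch: userns_status is a hypothetical name, and it assumes the setting is exposed at /proc/sys/kernel/unprivileged_userns_clone, which is only true on kernels that carry this knob.

```shell
#!/bin/sh
# userns_status [FILE]
# Hypothetical helper: report whether unprivileged user namespaces are enabled.
# FILE defaults to the procfs path behind kernel.unprivileged_userns_clone;
# the optional argument exists so the function can be pointed at a test file.
userns_status() {
    file="${1:-/proc/sys/kernel/unprivileged_userns_clone}"
    if [ ! -r "$file" ]; then
        # Assumption: a missing file means this kernel does not expose the knob
        echo "not present"
        return 0
    fi
    case "$(cat "$file")" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}

userns_status
```

If this prints "enabled", the sysctl step above can be skipped.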
You’ll want to enable cgroups v2 support (which is normally disabled) in the kernel as well. The Arch Wiki cgroups entry covers it very well.
I have grub on my system, so I followed the grub kernel parameter instructions: I backed up /boot/grub/grub.cfg, added systemd.unified_cgroup_hierarchy=1 to the kernel params in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, and then ran grub-mkconfig.
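After rebooting, it can be worth confirming that the parameter actually made it onto the kernel command line. The sketch below assumes a hypothetical helper name (has_kernel_param) and reads /proc/cmdline; the stat call at the end prints the filesystem type mounted at /sys/fs/cgroup, which should be cgroup2fs when the unified hierarchy is in use.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Hypothetical helper: succeed if PARAM appears as a whole word in the
# kernel command line (CMDLINE_FILE defaults to /proc/cmdline).
has_kernel_param() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # Put each parameter on its own line, then look for an exact match
    tr -s ' \t' '\n\n' < "$file" | grep -Fxq -- "$param"
}

if has_kernel_param systemd.unified_cgroup_hierarchy=1; then
    echo "unified cgroup hierarchy requested on the kernel command line"
else
    echo "parameter not found on the kernel command line"
fi

# Print the filesystem type of /sys/fs/cgroup, if it exists
if [ -d /sys/fs/cgroup ]; then
    stat -fc %T /sys/fs/cgroup 2>/dev/null || echo "could not stat /sys/fs/cgroup"
fi
```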
crun
Before we get into what crun is, we have to talk about runc, the de facto OCI-spec container runtime engine. As far as terminology goes, if we consider containerd to be a runtime, then I’d consider runc to be an engine “powering” that runtime.
crun is relevant and necessary because runc actually does not support cgroups v2 (as of the writing of this post) – crun needs to be present on the machine for recent versions of podman to work.
crun is a community package and is installable the usual way:
$ sudo pacman -S crun
/etc/subuid and /etc/subgid files if they don’t exist
The key feature of running rootless containers is support for Linux user namespacing. All the “rootless” container solutions and products out there work the same way:
- lxd (which runs lxc underneath)’s “System Containers” have had user namespacing the longest
- containerd’s rootless containers
- docker’s container isolation
- podman’s rootless containers
In our daily reminder that the kernel is just software: this feature requires the files /etc/subuid and /etc/subgid, along with some other utilities, to exist. I somehow didn’t have these files on my system, despite having binaries like newgidmap/newuidmap, so I just touch’d them into existence:
$ sudo touch /etc/subuid
$ sudo touch /etc/subgid
You can add a mapping to free up UIDs and GIDs for yourself to use by using usermod with the following syntax: usermod --add-subuids <start>-<end> --add-subgids <start>-<end> <username>. What I ran:
$ sudo touch /etc/subuid # create the subuid file (only necessary if it doesn't exist)
$ sudo touch /etc/subgid # create the subgid file (only necessary if it doesn't exist)
$ sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 <username>
NOTE FROM THE FUTURE - You should consider using a much wider range as input to this command (for both sub-uids and sub-gids); I ran into limitations on a big build later on.
NOTE FROM THE FUTURE, 2 - I’ve updated the usermod command in the post and added some clarifying commands, so it should be correct now. With these commands, the /etc/subuid and /etc/subgid files should be owned by root but contain rules that are relevant to <username>.
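To see how many subordinate IDs a user actually ended up with, you can sum the ranges in the file. The sketch below assumes the usual name:start:count line format of /etc/subuid and /etc/subgid; subid_total is a hypothetical helper name, not part of any tool mentioned here.

```shell
#!/bin/sh
# subid_total USER [FILE]
# Hypothetical helper: sum the counts of every range FILE grants to USER.
# FILE defaults to /etc/subuid; each line is expected to be name:start:count.
subid_total() {
    user="$1"
    file="${2:-/etc/subuid}"
    awk -F: -v u="$user" '$1 == u { total += $3 } END { printf "%.0f\n", total }' "$file"
}

# The range 100000-165535 is inclusive on both ends, so it holds this many IDs:
echo $((165535 - 100000 + 1))

# Total sub-UIDs for the current user, if the file is there
if [ -r /etc/subuid ]; then
    subid_total "$(id -un)" /etc/subuid
fi
```

Passing /etc/subgid as the second argument does the same check for sub-GIDs.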
podman and alias docker
Well, this is pretty obvious: just sudo pacman -S podman. After installing podman you might want to alias docker to it in your .bashrc or .bash_profile. Note that if you’re using docker in Makefiles you may need to do something like:
.PHONY: all some-target
all: some-target
DOCKER ?= podman
some-target:
	$(DOCKER) subcommand ...
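For the shell side, the simplest option is alias docker=podman in your .bashrc. As an alternative sketch (my substitution, not something this setup requires), a small wrapper function forwards all arguments unchanged and behaves the same way at an interactive prompt:

```shell
# Candidate snippet for ~/.bashrc: forward every docker invocation to podman.
# This is a wrapper function rather than `alias docker=podman`; both approaches
# only affect shells that load this file.
docker() {
    podman "$@"
}
```

With this in place, docker run --rm alpine ends up calling podman run --rm alpine.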
docker run alpine
Here’s what it looks like when everything works:
$ docker run --rm alpine
Trying to pull docker.io/library/alpine...
Getting image source signatures
Copying blob df20fa9351a1 done
Copying config a24bb40132 done
Writing manifest to image destination
Storing signatures
If you want to run alpine but actually get into a shell, run docker run --rm -it alpine /bin/ash.
/etc/shadow invalid argument
I found the solution to this issue on GitHub in the podman repo, but I’ll go into it here. The output looked like this:
$ docker run alpine
Trying to pull docker.io/library/alpine...
Getting image source signatures
Copying blob df20fa9351a1 done
Copying config a24bb40132 done
Writing manifest to image destination
Storing signatures
Error processing tar file(exit status 1): there might not be enough IDs available in the namespace (requested 0:42 for /etc/shadow): lchown /etc/shadow: invalid argument
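The error says that the image wants /etc/shadow owned by 0:42, and that ID 42 does not seem to be available inside the user namespace. One rough way to inspect this is to look at the ID maps, which have three columns: first ID inside the namespace, first ID outside, and length. The helpers below are a sketch with hypothetical names (mapped_id_count, is_id_mapped) that work on any file in that format, such as /proc/self/uid_map or /proc/self/gid_map.

```shell
#!/bin/sh
# mapped_id_count [MAP_FILE]
# Hypothetical helper: sum the length column of a uid_map/gid_map style file.
mapped_id_count() {
    file="${1:-/proc/self/uid_map}"
    awk '{ total += $3 } END { printf "%.0f\n", total }' "$file"
}

# is_id_mapped ID [MAP_FILE]
# Hypothetical helper: succeed if ID (as seen inside the namespace) falls
# within one of the ranges listed in MAP_FILE.
is_id_mapped() {
    id="$1"
    file="${2:-/proc/self/gid_map}"
    awk -v id="$id" '$1 <= id && id < $1 + $3 { found = 1 } END { exit !found }' "$file"
}

# Example against the current process (outside any container namespace)
if [ -r /proc/self/gid_map ]; then
    if is_id_mapped 42; then
        echo "GID 42 is mapped"
    else
        echo "GID 42 is not mapped"
    fi
fi
```

A map with a single line of length 1 covers only your own ID, which would be consistent with the failure above.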
In the end the solution was to do as instructed in the issue:
1. rm -r ~/.config/containers (add the f to -r yourself, and make sure you don’t accidentally rm -rf something you don’t want to)
2. rm -r ~/.local/share/containers (again, add -f yourself)
3. podman system migrate
4. podman unshare cat /proc/self/uid_map
useradd populates /etc/subuid and /etc/subgid, but not for system users
Thanks to u/MachaHack on reddit for noting this:
One slightly annoying detail is that useradd will populate /etc/subuid and /etc/subgid with a reasonable automatic range (~1000 entries by default) if they exist… Unless you make the user a system user. In which case they get none and you need to calculate/add a new range manually. This isn’t documented, and the only note on it is in a bug tracker where another distributions maintainer mentioned they didn’t need it for system users as they had their own tooling for that.
I tend to run my rootless containers under their own users still to avoid issues if there’s a breakout vulnerability - a single purpose user for e.g. graylog is less of an impact than someone having access to my personal account.
So note that system users are not given a reasonable automatic range after they’re added. And the point on breakout vulnerabilities is also great – in a callback to the times where user-level isolation was the way, it might make sense to run some program like graylog with an isolated/constrained container as a constrained graylog user for added security.
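If you do need to work out a range by hand for a system user, one approach is to start right after the highest range already in the file. The sketch below makes several assumptions: next_free_subid is a hypothetical helper, the file uses name:start:count lines, a 65536-wide range is wanted, /etc/subgid is laid out the same as /etc/subuid, and graylog is just the example user from the quote. It only prints the usermod command rather than running it.

```shell
#!/bin/sh
# next_free_subid [FILE]
# Hypothetical helper: print the first ID after the highest range in FILE
# (lines of the form name:start:count). Falls back to 100000, the start
# used earlier in this post, when the file has no entries.
next_free_subid() {
    file="${1:-/etc/subuid}"
    awk -F: '
        NF >= 3 { end = $2 + $3; if (end > max) max = end }
        END { printf "%.0f\n", (max > 0 ? max : 100000) }
    ' "$file"
}

if [ -r /etc/subuid ]; then
    start="$(next_free_subid /etc/subuid)"
    end=$((start + 65535))
    # Assumption: the same range is also free in /etc/subgid; check before use
    echo "sudo usermod --add-subuids $start-$end --add-subgids $start-$end graylog"
fi
```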
With this I finally have daemon-less podman available and I can even systemctl disable the docker systemd unit (well actually I always manually start it after restarts but I digress).
Hopefully this succinct guide helps someone out there get their setup figured out!