Yet another cluster re-install after switching back to Container Linux


tl;dr - After a bad kernel upgrade (pacman -Syu) on my Arch-powered server I decided to go back to Container Linux, after being equal parts annoyed by Arch and encouraged by the press release put out by Red Hat. This time, I spent much more time with the Ignition config files in conjunction with kubeadm and ended up with a bootable master node. Feel free to check out the TLDR at the end.

UPDATE (05/25/2018)

I've recently released another post in which I go into this same setup, but with support for rkt as the container runtime and kube-router. rkt is integrated into CoreOS, so it's pretty simple, but I wrote it up anyway.


UPDATE (05/21/2018)

I was going through getting my Rook cluster back up and running on the new cluster, and realized that I forgot about the custom flexvolume directory specification required for Container Linux. This exposed an issue with the systemd drop-in file that gets overridden (/etc/systemd/system/kubelet.service.d/0-containerd.conf): I was missing an ending `"`! The Ignition config in this post has been updated to fix the typo. In addition, it looks like the remote runtime mechanism has actually *NOT* been working (because it wasn't configured to do so). While it's not a strict requirement for most people, it is for me; you can just take it out if you don't mind using docker (which is using a containerd shim these days IIRC).

After some experimentation, I had to alter the default containerd.service that comes with Container Linux to ensure it runs containerd directly, instead of the default one included in Container Linux. The ignition.yaml required an update to achieve this. Basically the problem is that while I was downloading and installing a new containerd binary, it had to be installed to /opt/bin due to /usr/local not being writable in CoreOS. The systemd service carried on expecting /usr/local/bin/containerd to contain the proper binary (which it does, but it's the Container Linux docker-shim version). This caused issues with kubelet because in Kubernetes 1.10.x only the newer containerd works. Since I'm pretty committed to at least providing a configuration that worked as expected at one point in time, I rebuilt the server and can confirm that the updated code (ignition.yaml) in this post works for me currently.

After the relatively recent acquisition of CoreOS by Red Hat I switched off of Container Linux, the linux distribution maintained by CoreOS for use in “cloud-native” heavily containerized environments. I (prematurely) thought that Red Hat was going to kill it off in support of their own Atomic project, but it looks like Container Linux will live on, according to their recent press release. The premise of CoreOS is really attractive – it’s basically a distribution that is built to do not much more than run containers and be relatively secure while doing so, also with a good update mechanism to prevent updates that break the machine.

Along with this cluster install I’m also going to go from Kubernetes 1.9 -> 1.10 (partly the reason I was messing around with the cluster to begin with). After the kernel upgrade on the old machine, the next reboot failed with an error saying that the iptables kernel module was not found (which I could only see through a direct KVM connection, since sshd couldn’t run). This is basically unacceptable – updating with pacman -Syu should not break my system on a restart, and I’m super tired of it happening (it’s happened once before). I’m a bit disappointed because outside of this, Arch is super stable, well documented, and a relatively minimal OS without too many security holes. It moves at a pace that’s just about perfect for me, but it looks like it isn’t the best fit for me as a lazy consumer, especially in the context of servers as cattle. So rather than an in-place k8s upgrade (1.9 -> 1.10), it turned into a full cluster re-install – I’m pretty glad I don’t have any “production” critical workflows.

coreos + ignition + kubeadm = <3

While I was pretty frustrated to have my cluster need to be rebuilt, rebuilding the cluster from scratch allowed me to revisit how I build the cluster and learn a bit more about the ecosystem. Unfortunately, there’s a bit of context missing here that I haven’t written about yet: I actually built my cluster “the hard way” once (that blog post is still in the raw state, I haven’t released it), so this time I wanted to give kubeadm a try.

I chose kubeadm instead of other tools like kubespray or kops because it is (and has always been IIRC) the best choice for a “baremetal” installation, outside of a custom cluster install – so no Terraform or AWS related set up for me. kubeadm has the ideal interface for me - assuming the machine is set up properly initially, it’s down to kubeadm init and kubeadm join commands. Doesn’t get simpler than that.

One big change this time around was also that I chose to go with the hosted control plane. This means that all I need to set up on the server is the kubelet – and all control plane components (apiserver, controller-manager, scheduler, kube-proxy) are actually created/managed by the kubelet, by using manifests (stored @ /etc/kubernetes/manifests). This was actually the pattern I used the very first time I set up Kubernetes on a server running CoreOS using their guide (that is now mostly gone). The next time that I did it, using the hard-way guide (again, I haven’t written about this yet), I started the control plane components outside kubelet and managed them with systemd.
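
To see what the kubelet is managing this way, it's enough to look at the manifest directory. Here's a minimal sketch; the function name `list_static_pod_manifests` is made up for illustration, and the default directory is the path mentioned above. On a kubeadm-initialized master I'd expect entries like kube-apiserver, kube-controller-manager, kube-scheduler and etcd, but treat that as an assumption and check your own node.

```shell
#!/bin/bash

# Hypothetical helper (not part of kubeadm or kubelet): print the name of
# every static pod manifest in a directory, one per line, without ".yaml".
list_static_pod_manifests() {
  local dir="${1:-/etc/kubernetes/manifests}"
  local manifest
  for manifest in "$dir"/*.yaml; do
    # When nothing matches, the glob stays literal; skip that case.
    [ -e "$manifest" ] || continue
    basename "$manifest" .yaml
  done
}

# Usage on the master (after kubeadm init has run):
#   list_static_pod_manifests
```

Editing or removing a file in that directory is how you change or stop the corresponding component, since the kubelet watches the directory itself.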

OK now let’s get into the scripts I actually used to get everything done.

Tool download script

I use a baremetal provider I really like, Hetzner (famous for their Robot Market), and they provide a “rescue” image that one can use, so the starting point is there. It’s basically like running a LiveCD, you can access the server’s file systems and whatever else is necessary to install an OS. Unfortunately they don’t support CoreOS as an installable OS right now but this just means I needed to follow the CoreOS guide for installing Container Linux to disk.

You might want to read a little about Container Linux before reading on – don’t worry, I’ll wait.

To get Container Linux installed to disk, we’re going to need two tools primarily – ct (the Config Transpiler) and the coreos-install script. Here’s a quick script to download these two tools:

#!/bin/bash

echo "Installing coreos-install..."
curl -sSL https://raw.githubusercontent.com/coreos/init/master/bin/coreos-install > /bin/coreos-install
chmod +x /bin/coreos-install

echo "Installing ct..."
curl -sSL https://github.com/coreos/container-linux-config-transpiler/releases/download/v0.8.0/ct-v0.8.0-x86_64-unknown-linux-gnu > /bin/ct
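# Check the file we just wrote (/bin/ct), and use { } so that exit stops the script rather than a subshell
echo "9f9d9d9c802a6dc875067295aa9d7f44f1e66914cc992cf215dd459ee2b719fde4ebfa036bb8488cfd87ae2efafc5d767de776fe11a4661fc45e8d54385953a4  /bin/ct" | sha512sum -c || { echo "ERROR: SHA512 Hash does not match for ct v0.8.0"; exit 1; }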
chmod +x /bin/ct

This script could be better – a sha512sum check on coreos-install would be good, but I don’t know how often the coreos-install script changes, and the CoreOS init repo it lives in doesn’t seem to do releases. I did include one for ct (since it’s pinned @ v0.8.0), so that’s an example.
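
If you do want to pin coreos-install as well, the check can be pulled out into a small function and reused for both downloads. This is only a sketch: `verify_sha512` is a hypothetical helper name, and the expected hash is something you'd have to compute yourself from a copy of the script you've reviewed (none is published here).

```shell
#!/bin/bash

# Hypothetical helper: succeed only if FILE matches the EXPECTED sha512 sum.
verify_sha512() {
  local file="$1" expected="$2"
  # sha512sum -c reads "<hash>  <path>" lines (two spaces) from stdin;
  # --status suppresses output so only the exit code reports the result.
  echo "${expected}  ${file}" | sha512sum -c --status -
}

# Usage (the hash is a placeholder, pin your own):
#   verify_sha512 /bin/coreos-install "<EXPECTED_SHA512>" \
#     || { echo "ERROR: SHA512 mismatch for coreos-install" >&2; exit 1; }
```

Pinning to a specific commit URL instead of master would make the pinned hash stay valid, at the cost of having to bump both by hand.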

Ignition YAML configuration

The YAML configuration that is going to be fed into ct is pretty intense, but it completely lays out the process. On the CoreOS site, the latest listed Ignition Specification is 2.1; however, if you download ct version 0.8.0 (like above) you can find the configuration specification for that version in the GitHub repo.

Without any further ado, here is the (currently working) monstrosity:

# This config is meant to be consumed by the config transpiler, which will
# generate the corresponding Ignition config. Do not pass this config directly
# to instances of Container Linux.

# NOTE: This configuration is meant to work with Config Transpiler v0.8.0
# The spec is available at (https://github.com/coreos/container-linux-config-transpiler/blob/v0.8.0/doc/configuration.md)

passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-rsa <REST OF SSH KEY> user@somewhere.com
        - ssh-rsa <REST OF ANOTHER SSH KEY> user@somewhere.com

systemd:
  units:
    # Docker will be configured initially but we'll be using containerd exclusively and will disable it after containerd setup
    - name: docker.service
      enabled: true

    # containerd without docker as a shim, thanks to containerd.service.d/ overrides
    - name: containerd.service
      enabled: true

    - name: k8s-install.service
      enabled: true
      contents: |
        [Install]
        WantedBy=multi-user.target

        [Unit]
        Description=k8s installation script
        Wants=network-online.target
        After=network.target network-online.target

        [Service]
        Type=oneshot
        ExecStart=/ignition/init/k8s/install.sh

    - name: cni-install.service
      enabled: true
      contents: |
        [Install]
        WantedBy=multi-user.target

        [Unit]
        Description=cni plugin installation script
        Requires=k8s-install.service
        After=k8s-install.service

        [Service]
        Type=oneshot
        ExecStart=/ignition/init/cni/install.sh

    - name: containerd-install.service
      enabled: true
      contents: |
        [Install]
        WantedBy=multi-user.target

        [Unit]
        Description=containerd installation script
        Requires=cni-install.service
        After=cni-install.service

        [Service]
        Type=oneshot
        ExecStart=/ignition/init/cri-containerd/install.sh

    - name: kubeadm-install.service
      enabled: true
      contents: |
        [Install]
        WantedBy=multi-user.target

        [Unit]
        Description=kubeadm installation script
        Requires=containerd-install.service
        After=containerd-install.service

        [Service]
        Type=oneshot
        Environment="PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin"
        ExecStart=/ignition/init/kubeadm/kubeadm-install.sh

    - name: k8s-setup.service
      enabled: true
      contents: |
        [Install]
        WantedBy=multi-user.target

        [Unit]
        Description=kubernetes setup script
        Requires=kubeadm-install.service
        After=kubeadm-install.service

        [Service]
        Type=oneshot
        User=core
        Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/bin"
        ExecStart=/ignition/init/k8s/setup.sh

storage:
  filesystems:
    - mount:
        device: /dev/disk/by-label/ROOT
        format: xfs
        wipe_filesystem: true
        label: ROOT

  files:
    - path: /opt/bin/kubeadm
      filesystem: root
      mode: 493 # 0755
      contents:
        remote:
          url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubeadm
          verification:
            hash:
              function: sha512
              sum: fc96e821fd593a212c632a6c9093143fab5817f6833ba1df1ced2ce4fb82f1ebefde71d9a898e8f9574515e9ba19e40f6ab09a907f6b1b908d7adfcf57b3bf8b

    - path: /opt/bin/kubelet
      filesystem: root
      mode: 493 # 0755
      contents:
        remote:
          url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubelet
          verification:
            hash:
              function: sha512
              sum: 5cf4bde886d832d1cc48c47aeb43768050f67fe0458a330e4702b8071567665c975ed1fe2296cba5aea95a6de0bec4b731a32525837cac24646fb0158e2c2f64

    - path: /opt/bin/kubectl
      filesystem: root
      mode: 511 # 0777
      contents:
        remote:
          url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubectl
          verification:
            hash:
              function: sha512
              sum: 38a2746ac7b87cf7969cf33ccac177e63a6a0020ac593b7d272d751889ab568ad46a60e436d2f44f3654e2b4b5b196eabf8860b3eb87368f0861e2b3eb545a80

    - path: /etc/systemd/system/kubelet.service
      filesystem: root
      mode: 420 # 0644
      contents:
        remote:
          url: https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.2/build/debs/kubelet.service
          verification:
            hash:
              function: sha512
              sum: b9ca0db34fea67dfd0654e65d3898a72997b1360c1e802cab5adc4288199c1a08423f90751757af4a7f1ff5932bfd81d3e215ce9b9d3f4efa1c04a202228adc8

    - path: /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
      filesystem: root
      mode: 420 # 0644
      contents:
        remote:
          url: https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.2/build/debs/10-kubeadm.conf
          verification:
            hash:
              function: sha512
              sum: 32cfc8e56ec6e5ba93219852a68ec5eb25938a39c3e360ea4728fc71a14710b6ff85d0d84c2663eb5297d5dc21e1fad6914d6c0a8ce3357283f0b98ad4280ef7

    - path: /ignition/init/cri-containerd/cri-containerd-1.1.0.linux-amd64.tar.gz
      filesystem: root
      mode: 420 # 0644
      contents:
        remote:
          url: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.1.0.linux-amd64.tar.gz
          verification:
            hash:
              function: sha512
              sum: c5db1b99e1155ed0dfe945bc1d53842fd015cb891f8b773c60fea73f9fe7c3e0bda755133765aa618c08765eb13dbf244affb5a1572d5a888ff4298ba3d790cf

    - path: /ignition/init/cni/cni-plugins-v0.7.1.tgz
      filesystem: root
      mode: 420 # 0644
      contents:
        remote:
          url: https://github.com/containernetworking/plugins/releases/download/v0.7.1/cni-plugins-amd64-v0.7.1.tgz
          verification:
            hash:
              function: sha512
              sum: b3b0c1cc7b65cea619bddae4c17b8b488e7e13796345c7f75e069af93d1146b90a66322be6334c4c107e8a0ccd7c6d0b859a44a6745f9b85a0239d1be9ad4ccd

    - path: /ignition/init/canal/rbac.yaml
      filesystem: root
      mode: 493 # 0755
      contents:
        remote:
          url: https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/rbac.yaml
          verification:
            hash:
              function: sha512
              sum: e045645a1f37b4974890c3e4f8505a10bbb138ed0723869d7a7bc399c449072dfd2c8c2c482d3baac9bf700b7b0cfdca122cb260e70b437fb495eb86f9f6cccc

    - path: /ignition/init/canal/canal.yaml
      filesystem: root
      mode: 493 # 0755
      contents:
        remote:
          url: https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/canal.yaml
          verification:
            hash:
              function: sha512
              sum: 5c4953d74680a2ff84349c730bba48d0d43e28b70217c59cdf736f65e198f362c25b534b686c10ed5370dbb1bca4a1326e42039ddee7eb7b0dd314cc08523b67

    - path: /ignition/init/k8s/install.sh
      filesystem: root
      mode: 480 # 740
      contents:
        inline: |
          #!/bin/bash

          # Unzip the kubernetes binaries if not already present
          test -d /opt/bin/kubeadm && echo "k8s binaries (kubeadm) already installed" && exit 0

          # NOTE: If RELEASE is updated, the SHA512 SUMs will need to be as well
          echo -e "=> Installing k8s v1.10.2"

          echo "=> Customizing kubelet.service..."
          sed -i "s:/usr/bin:/opt/bin:g" /etc/systemd/system/kubelet.service
          sed -i "s:/usr/bin:/opt/bin:g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

          systemctl daemon-reload
          systemctl enable kubelet
          systemctl start kubelet

    - filesystem: root
      path: /ignition/init/cri-containerd/install.sh
      mode: 480 # 740
      contents:
        inline: |
          #!/bin/bash

          # Skip if containerd has already been unpacked
          test -d /opt/containerd && echo "containerd binaries already installed" && exit 0

          VERSION=1.1.0
          echo -e "=> Installing containerd v${VERSION}"

          echo "=> Installing containerd...."
          cd /ignition/init/cri-containerd
          tar -C / -k -xzf cri-containerd-${VERSION}.linux-amd64.tar.gz

          echo "=> Copying /usr/local binaries to /opt/bin ...."
          mkdir -p /ignition/init/cri-containerd/unzipped
          tar -C unzipped -k -xzf cri-containerd-${VERSION}.linux-amd64.tar.gz
          cp -r unzipped/usr/local/bin/* /opt/bin
          systemctl start containerd

          echo "=> Adding dropins...."
          cat > /etc/systemd/system/kubelet.service.d/0-containerd.conf <<EOF
          [Service]
          Environment="KUBELET_EXTRA_ARGS=--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock --volume-plugin-dir=/var/lib/kubelet/volumeplugins"
          EOF

          mkdir -p /etc/systemd/system/containerd.service.d/
          cat > /etc/systemd/system/containerd.service.d/0-direct-containerd.conf <<EOF
          [Service]
          ExecStart=
          ExecStart=/opt/bin/containerd
          EOF

          echo "=> Triggering systemctl daemon-reload...."
          systemctl daemon-reload
          systemctl restart containerd

    - filesystem: root
      path: /ignition/init/cni/install.sh
      mode: 480 # 740
      contents:
        inline: |
          #!/bin/bash

          # Skip if the CNI binaries are already installed
          test -d /opt/cni/bin && echo "CNI binaries already installed" && exit 0

          VERSION=0.7.1
          echo -e "=> Installing CNI (v${VERSION}) binaries to /opt/cni/bin"
          cd /ignition/init/cni
          mkdir -p /opt/cni/bin
          tar -C /opt/cni/bin -k -xzf cni-plugins-v${VERSION}.tgz

    - filesystem: root
      path: /ignition/init/kubeadm/kubeadm-install.sh
      mode: 480 # 740
      contents:
        inline: |
          #!/bin/bash

          # Ensure kubeadm binary is present
          test -f /opt/bin/kubeadm || { echo "Failed to find kubeadm binary"; exit 1; }

          # Exit if kubeadm has already been run (/etc/kubernetes folder would have been created)
          test -d /etc/kubernetes && echo "/etc/kubernetes is present, kubeadm should have already been run once" && exit 0

          echo "=> Running kubeadm init..."
          /opt/bin/kubeadm init --cri-socket "/run/containerd/containerd.sock" --pod-network-cidr "10.244.0.0/16"

          # Disable docker (kubelet will use containerd runtime directly)
          sudo systemctl stop docker
          sudo systemctl disable docker

          echo "=> Running kubeadm post-install set up for user 'core'"
          mkdir -p /home/core/.kube
          cp -i /etc/kubernetes/admin.conf /home/core/.kube/config
          chown $(id -u core):$(id -g core) /home/core/.kube/config

    - filesystem: root
      path: /ignition/init/k8s/setup.sh
      mode: 493 # 0755
      contents:
        inline: |
          #!/bin/bash

          # Ensure /etc/kubernetes is present (created by kubeadm)
          test -d /etc/kubernetes || { echo "/etc/kubernetes not present, ensure kubeadm has run properly"; exit 1; }

          echo "=> Enabling workload running on the master node"
          kubectl taint nodes --all node-role.kubernetes.io/master-

          echo "=> Installing canal"
          kubectl apply -f /ignition/init/canal/rbac.yaml
          kubectl apply -f /ignition/init/canal/canal.yaml

This is a lot to take in, and is pretty hacky, but it worked for me, and I didn’t have to invest too much in learning another tool. This kind of setup is definitely best automated with some tool like Ansible, but since I have to work with this file from the rescue OS I’m leaving it as-is. Currently there doesn’t seem to be a way to submit an image/userdata for baremetal servers on Hetzner – you have to request a KVM to load an ISO, so jamming everything into this file is good enough for me. Terraform would also have been a good candidate here, but their support for baremetal providers was somewhat lacking last time I checked (maybe that’s changed these days).
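
One detail that's easy to trip over when editing the config: the `mode` values are written as decimal integers, with the familiar octal form kept in a trailing comment (493 for 0755, 420 for 0644, and so on). A quick way to convert in either direction, as a sketch with made-up function names:

```shell
#!/bin/bash

# Hypothetical helpers for translating file modes between the octal form
# people usually think in and the decimal form used in the config above.

# octal_to_decimal 0755 (or 755) prints the decimal value.
octal_to_decimal() {
  # A leading 0 makes printf read the argument as an octal constant.
  printf '%d\n' "0${1#0}"
}

# decimal_to_octal 493 prints the four-digit octal form.
decimal_to_octal() {
  printf '%04o\n' "$1"
}

# Usage:
#   octal_to_decimal 0755   # prints 493
#   decimal_to_octal 480    # prints 0740
```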

Astute readers might have also noticed that I moved off of using kube-router – I had some problems getting it set up properly in this setup with kubeadm and got frustrated, enough to just go with canal instead, which I’ve used in the past. I don’t really have good descriptions of the crashes I was running into (generally kube-router just crashing with no output and restarting despite being on the highest verbosity level --v=3), but I’ll probably give kube-router another try sometime in the future.

The overall install process looked something like this:

  1. Boot the server into recovery mode (this is specific to my baremetal provider, the rescue image is like when you boot off a LiveCD ISO)
  2. (from the installation computer) scp download-tools.sh root@<machine>:/download-tools.sh && scp master/ignition.yaml root@<machine>:/ignition.yaml
  3. (on the node) # /download-tools.sh
  4. (on the node) # ct -in-file /ignition.yaml -out-file /ignition.json
  5. (on the node) # coreos-install -d /dev/sda -i /ignition.json
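
Since the last step wipes a disk, it can be worth wrapping the on-node steps in something that prints what it would do before doing it. The following is a rough sketch, not what was actually run for this post: `run`, `install_container_linux` and the `DRY_RUN` variable are all hypothetical names, and only the three commands themselves come from the steps above.

```shell
#!/bin/bash

# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo a command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around steps 3-5; stops at the first failing step.
install_container_linux() {
  local disk="${1:-/dev/sda}"
  run /download-tools.sh &&
    run ct -in-file /ignition.yaml -out-file /ignition.json &&
    run coreos-install -d "$disk" -i /ignition.json
}

# Usage (from the rescue system):
#   install_container_linux /dev/sda             # prints the three commands
#   DRY_RUN=0 install_container_linux /dev/sda   # runs them for real
```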

Improvements

A few ways I can think of to improve upon this:

There’s probably much more, but I just haven’t done much thinking :).

TLDR

From a recovery/LiveCD (Hetzner rescue mode) environment on my dedicated server I was able to install & initialize CoreOS (using ct and coreos-install) using Ignition in a way that sets up a Kubernetes master node with kubeadm. For my specific usecase, the master taint is removed so I can run workloads on it, and presto, I have an easy-to-boot master node. The process looks like this:

0. Boot the server into recovery mode (this is specific to my baremetal provider, the rescue image is like when you boot off a LiveCD ISO)

1. Copy the download-tools script and the Ignition YAML to the machine (see above for the contents of both)

   $ scp download-tools.sh root@<machine>:/download-tools.sh && scp master/ignition.yaml root@<machine>:/ignition.yaml

2. Install the Container Linux toolchain (on the node)

   $ /download-tools.sh

3. Generate the Ignition JSON (on the node, check above for the YAML content)

   $ ct -in-file /ignition.yaml -out-file /ignition.json

4. Run coreos-install (on the node)

   $ coreos-install -d /dev/sda -i /ignition.json

Wrapup

It took a lot of iterations (LOTS of KVM debugging and machine reboots) to get to this point, but I’m pretty happy with it, and figured I might try and save some people some time if they happen to be doing something similar.

Hope you find the post useful!