tl;dr - I did another round of drive testing (originally I only tested OpenEBS and hostPath), this time with some rented Hetzner machines and Ansible-powered automation. The GitLab repository isn’t ready for mass consumption yet but I’ll update here (and this tl;dr) when it is, along with the results.
The GitLab repository is up! You can skip this entire article and just go there.
NOTE: This is a multi-part blog post!
Any deployment platform (Nomad, Kubernetes, AWS) generally needs to give you three things – compute, network, and storage. Yes, they give you “tasks”, “instances”, “VPCs”, “ELBs” and all these other terms, but at the core they must provide a way to run the program you’ve written, access to the network so you have the option of talking to the outside world, and storage so you have the option of writing out the results of your hard work. Today we’re going to focus on the storage bit of building cloud platforms with Kubernetes (sorry, no Nomad or AWS talk today), and how my personal favorite solutions (and one newcomer) perform against one another.
The solutions that I’m going to compare are:
- OpenEBS
- Rook (Ceph)
- LINSTOR (built on the drbd kernel driver)

I like these solutions in the k8s landscape because they “scale” from hobbyist to enterprise. They’re not the fastest solutions out there – that would probably be hardware RAID, administered by appropriately bearded sysadmins in your local data center. What they do offer is the functionality you’d expect from a resilient cloud platform, and the ability to run that technology locally. Ceph is well known, well documented open source software used by large organizations (if it’s good enough for CERN it’s definitely good enough for your homelab or 10/100/1000-customer SaaS company), and OpenEBS is a newer entrant that introduced a cloud-first approach that was the easiest to deploy and use on Kubernetes by far. OpenEBS has been refining their solution and building exciting new things since before operators were called operators, and they’re trusted by some very interesting companies – see the “social proof” on their landing page. I point out that it’s “social proof” not to malign OpenEBS – they’re used by some impressive organizations and I support them wholeheartedly; it’s amazing software that is being given away for free – but just as a tidbit for anyone who’s not aware why companies post sections like that on their landing pages.
I run OpenEBS myself, but run such an old version (v0.8.0) that the documentation isn’t even listed anymore – I started writing a post on how I was going to back up and transfer my drives from one version to another with downtime, by stopping containers, hooking up other containers to do some rsync-ing, etc. – but I figured it was worth taking another look at the performance characteristics of the options in the space. There are ~5 ways to actually deploy OpenEBS (essentially different backends that your PVCs will run on), and Rook deserved to be tested alongside. The thing about Rook that’s made it hard to test in the past is that it needs complete control over a local disk. Since I run on “bare metal” (a confusing term which means you have full control of a physical machine, not a VM on someone’s physical machine) this isn’t hard to come by, but I almost exclusively use Hetzner and they give you two disks pre-installed with mdadm-administered software RAID. In the past I found out rather painfully that any install of grub would clobber the disassembled RAID and render the Hetzner dedicated server unbootable, so I opted to go with OpenEBS running on the undisturbed software RAID – this is what birthed the post comparing it with hostPath.
LINSTOR is the newcomer this time – unlike the others, I haven’t ever set it up or used it before now – but I think it’s worth checking out because it is built on some well-grokked F/OSS graybeard tech that is usually just quietly purring along, powering an unreasonable number of systems. Usually if something is widely used but derided/has some weird corners, you see people complain about it a lot, if only in half-jest. It’s really hard to navigate the LINSTOR site and find your way to the code, given how enterprise-ready (tm) and confusingly it is laid out, but don’t let that discourage you.
OK well, let’s get to it – this time I’m going to aim a bit higher on the reproducibility (i.e. proper science & engineering) side of things – the general plan is to write some code that will rent a machine from Hetzner, install our storage plugin of choice, do all the testing we want, and report results in a fully automated manner.
Before we get started, there is a lot to read up on and (re-)introduce yourself to:
And for the particular technologies we’ll be using:
I’ve been familiar with these concepts for a long time, so I can’t imagine what it’d be like to be thrown into this today and have to read all of it to get a firm grasp on what’s about to happen – but remember, there’s no shame in skimming if your eyes start to glaze over. For most of these things, having a basic understanding of how they work and what pieces are involved is enough, especially if you’re going to get hands-on experience with them in the near future anyway.
Before I start projects I like to organize my thoughts into a general plan with the simple steps I need to accomplish – the idea of how we can do one set of drive testing is pretty simple, but achieving 100% automated runs is not so simple. Let’s lay out what we need for one run before things start getting really complicated:
1. Rent a machine from Hetzner (let’s call this $SERVER)
2. Get $SERVER’s IP and use ansible to set up the server for use with our storage plugin of choice, let’s call this $STORAGE_PLUGIN (ex. installing iscsi for some OpenEBS variants)
3. Install Kubernetes on $SERVER (we’ll probably be using k0s here)
4. Install $STORAGE_PLUGIN on $SERVER (required DaemonSets, Deployments, StorageClasses, CRDs, etc.)
5. Run one or more $TESTs against $STORAGE_PLUGIN
6. Save the results of each test Job to the repository with some good naming scheme (ex. <single-node|cluster>_<plugin>_<plugin subtype/detail>_<pvc size>.json)
Let’s spend some time going into the various holes / not-well-understood areas of this plan.
Why k0s (as opposed to k3s)?

I really like the idea of a single binary for running Kubernetes – I think that’s actually one of the things that could be simplified about it (I’ve written about some other ideas for a competing platform) – and I personally like just about all of k0s’s choices:
- kine to get away from the hard etcd requirement (I personally want to use Postgres but the implementation is a bit inefficient right now)
- containerd as the default runtime

Another big thing that drew me to the project is the involvement of Natanael Copa – the creator of Alpine Linux. He is absolutely prolific, and I remember just running into his name all over Alpine Linux package lists (which makes sense of course) and almost implicitly trusting software he’s worked on. I actually never tried to find details about him until this post (I watched a nice interview with him), and I’m only more impressed. Alpine Linux is obviously the result of the work of lots of contributors and tireless (and tired) open source developers, and I think he’s run the project well and has been insanely generous with his time and skills to the open source community (and the large number of corporations that rely on his distribution). I’d venture to guess that most people get their first introduction to true static binaries with Alpine (because half of the time the easiest way to get there is to just run an alpine container and do your build there). Anyway, I try not to take part in cults of personality so I’ll stop gushing there.
There’s been some criticism of k0s from competing projects, and to be honest I thought it was kind of in bad taste. Binary size is such a silly measurement that when I see it come up I dismiss the person (in my head) almost offhand – I will happily give 200MB of disk space for a single binary that will turn a random server into a flexible, directable member of a resource pool. Windows installs are ~20GB (right? I haven’t checked in a while actually), an Ubuntu install needs a minimum of 1.1GB of disk space, and disks are getting bigger all the time. If I really needed the space I could move to a container Linux distro (ex. Flatcar Linux, Fedora CoreOS) and save myself lots of space. Anyway, I digress.
In general, the landing page and documentation of k0s also speak to me much more than those of k3s. I do not need Edge, IoT, CI (???), or Embedded use cases – I just need a Kubernetes distro that is low maintenance and runs on servers in the cloud. In more specific terms, the following differences do it for me versus k3s:
Yes, you can swap out or change all of these, but why bother when k0s makes just about all the choices the way I want from the get-go, doesn’t make choices in the places where I might disagree, and the two projects are otherwise very similar. I may end up missing k3s’s helpful inclusion of many host utilities (iptables/nftables, ethtool, etc.) but we’ll see about that.
One of the big problems with the current plan is that I’ll need to have access to one or more (as many as the machine has) raw disks. This is really only possible on Hetzner dedicated servers (AFAIK) and can’t really be done manually because every time the server is wiped it needs to be put into recovery mode and restarted again. In addition to going into recovery mode and restarting, I’m going to have to undo the software RAID for the given machine as well, to get access to the “raw performance” of the underlying drives without the added benefits of duplication.
Maybe I shouldn’t use Hetzner for this and should instead use Linode? Well actually I can’t do that either, because it looks like Linode doesn’t offer nested virtualization support – I should have realized this when the pricing page said “dedicated CPU”. It seems like Linode Bare Metal is still “coming soon” as well. Linode does have a way to use custom images, which is cool, but not much help if virtualization is going to be degraded.
It looks like Linode also has had some trouble with the consistency of the disk I/O on their machines:
I don’t want these issues to possibly affect the quality of the results, so I’ll skip Linode for this.
Digital Ocean isn’t in Japan yet as far as I can tell:
Though they did just go public, so maybe they will be soon. They support using custom virtual disk images for droplets, which is pretty cool as well, but that’s not going to help me on this particular quest.
So it looks like in general [I|P]aaS offerings should be out – I don’t want to deal with the possibility of some provider-optimized storage solution inconsistency to threaten the benchmark. Avoiding even Hetzner Cloud in favor of Hetzner Robot should give me the best chance at unfettered access to the disks on the machines I use.
Unfortunately this means one of the first things I’m going to have to do is write Ansible roles that can automate the Rescue System and disassembly of a Hetzner machine’s software RAID, probably by performing some crafted web requests or running some puppeteer scripts. See? Things are already getting complicated!
So in the end it looks like I’ll have to stick with Hetzner dedicated servers, which means I won’t be able to spin up new machines quite as easily as I thought – the machine(s) will have to be more pets than cattle.
One idea I had early on while writing these notes (and trying to figure out how to run this experiment) was whether or not to use loopback/virtual disks as the basis of the experimentation. It’s massively simpler to create a bunch of /dev/disks/virtual-[1-10].img files and mount them as disks, but the seeming performance variability is something I just don’t want poisoning my analysis. Technically it would poison all the analysis in the same manner, but I just don’t want to have to be checking what kind of writes and sync patterns the workloads are using – I want the tests to reflect what a naive installation (i.e. most installations, at least initially) would do.
Some approaches (namely OpenEBS) also use virtual disks and the creation of a virtual disk inside a virtual disk is expected to create some artificial slowdowns that I want to avoid as well.
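For context, here’s roughly what the loopback-disk approach I’m avoiding would look like as an ansible play – the backing file path, size, and device handling below are all hypothetical, just to show why it’s tempting (it needs nothing but a file):

- name: Create a file-backed loopback "disk" (illustrative only -- not used in this setup)
  hosts: all
  remote_user: root
  tasks:
    - name: Ensure the backing directory exists
      ansible.builtin.file:
        path: /var/lib/virtual-disks
        state: directory
    - name: Allocate a sparse 10G backing file
      ansible.builtin.command:
        cmd: truncate -s 10G /var/lib/virtual-disks/virtual-1.img
        creates: /var/lib/virtual-disks/virtual-1.img
    - name: Attach the file to the first free loop device (not idempotent -- purely illustrative)
      ansible.builtin.shell: |
        losetup --find --show /var/lib/virtual-disks/virtual-1.img
      register: loop_device
    - name: Show the loop device a storage plugin would then consume
      ansible.builtin.debug:
        var: loop_device.stdout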
What is a $TEST?

There are a lot of tests we could possibly run; here are the ones I want to make possible:

- a longhorn/dbench Job that writes out its results
- pgbench (Postgres is the best open source relational database out there, and it’s what I want to run 99% of the time)
- oltpbench (which runs TPC-C)
(which runs TPC-C)In the past I used dd
, iozone
, sysbench
, but this time I’m going to only use fio
(which is what longhorn/dbench
and it’s predecessor, leeliu/debnch
use). Along with simply testing the drives I want to make sure I run some tests that are somewhat close to the interesting things I tend to do with the PVCs and one of those is definitely “run databases” – in my case, Postgres whenever it’s an option or I have a choice – pgbench
and oltpbench
wil help me accomplish that strain of testing.
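To make the shape of a $TEST a bit more concrete, here’s a rough sketch of a fio run as a Kubernetes Job bound to a PVC. This is an illustrative stand-in rather than the actual longhorn/dbench manifest – the StorageClass name, image, and fio arguments below are placeholders to be swapped per plugin:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: openebs-cstor   # swap in the StorageClass under test
  resources:
    requests:
      storage: 10Gi
---
apiVersion: batch/v1
kind: Job
metadata:
  name: fio-test
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: fio
          image: alpine:3.13
          command: ["/bin/sh", "-c"]
          args:
            - |
              apk add --no-cache fio &&
              fio --name=randrw --directory=/data --rw=randrw --bs=4k \
                  --size=1G --ioengine=psync --direct=1 --runtime=60 \
                  --time_based --output-format=json --output=/data/results.json
          volumeMounts:
            - name: test-volume
              mountPath: /data
      volumes:
        - name: test-volume
          persistentVolumeClaim:
            claimName: test-pvc

The real runs will use dbench’s own Job definition; the point is just that every $TEST boils down to “a Job writing to a PVC provisioned by the StorageClass under test”.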
I’m choosing right now to have all of this orchestrated with make. There are a lot of options (scripting languages, things I’ve built in the past, etc.) for how I could run these tests, but I think I can get away with the least amount of time spent building automation by writing some Makefile scripts with the right amount of for loops and clever-but-not-too-clever $VARIABLEs. I’m going to need to manage disparate tools (ansible, kubernetes, etc.) which are written in different languages, and Makefiles are excellent glue when they’re not too complicated. I very much want to be able to end up with a flow like this:
$ export CLOUD=hetzner
$ export STORAGE_PLUGIN=openebs-mayastor
$ make provision k8s-install test-dbench cleanup
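The Makefile targets you’ll see below lean on an Ansible inventory (that’s what INVENTORY_PATH and TARGET point at); a minimal sketch of what mine could look like, with a made-up hostname and the variables the playbooks expect:

# inventory.yaml (hypothetical)
all:
  hosts:
    machine-01.example.com:
      # Hetzner Robot Webservice credentials used by the server-reset playbook
      hetzner_webservice_username: "changeme"
      hetzner_webservice_password: "changeme"
      # optional: where unattended-upgrades failure mail should go
      unattended_upgrade_email: "you@example.com"

With that in place, each run just needs TARGET=machine-01.example.com set (the targets refuse to run without it via require-target).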
Resetting $SERVER

Here’s our first side quest, made necessary since we’re going to be using Hetzner dedicated machines – I need to write a reusable way to:

- put the machine into rescue mode and re-install the OS (disassembling the pre-installed software RAID along the way)
- hold grub updates for applicable OSes so the server doesn’t brick

All of this is possible via the online “Robot” interface but unfortunately not possible through the Cloud API (which is, to be fair, for Hetzner Cloud). It looked like I’d have to string together some curl and/or some puppeteer/playwright scripting. Luckily for me, it turns out there’s a Robot Webservice API, and all I needed was ansible.builtin.uri – no crazy automation here.
Here’s what the server-reset Makefile target looks like:
server-reset: require-target
ifeq ("true",$(SKIP_SERVER_RESET))
	@echo "[info] SKIP_SERVER_RESET=true, skipping server reset step..."
else
	@echo "[info] resetting server @ [$(TARGET)]..."
	@$(ANSIBLE_PLAYBOOK) server-reset.yml \
		-i $(INVENTORY_PATH) \
		--limit=$(TARGET) \
		$(ANSIBLE_ARGS)
endif
Here’s the playbook, server-reset.yaml:
#
# Playbook for resetting a (hetzner) server
#
# Before this playbook is run, the server is expected to be in a state where ansible is
# functional and usable (after fresh machine allocation or after some other run has finished), and normally accessible.
#
# The installimage bits are heavily inspired by
# https://github.com/andrelohmann/ansible-role-hetzner_installimage
#
---
- name: k8s-setup all-in-one server setup
hosts: "{{ ansible_limit | default(omit) }}"
remote_user: root
# pre-server reset ansible is expected to be usable
# (*should* also work if this playbook is run while machine is in rescue mode)
gather_facts: yes
vars:
public_key: ~/.ssh/id_rsa.pub
# host key for the machine is going to change during this playbook
# (during rescue mode entry)
# so we need to disable strict host key checking
ansible_ssh_common_args: |
-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null
known_hosts_path: ~/.ssh/known_hosts
# Hetzner install image options
hetzner_installimage_drives:
- DRIVE1 /dev/sda
- DRIVE2 /dev/sdb
hetzner_installimage_raid:
- SWRAID 0
- SWRAIDLEVEL 0 # doesn't matter if SWRAID is 0
hetzner_installimage_hostname: "{{ inventory_hostname.split('.') | first }}"
hetzner_installimage_partitions:
# We'll be running with swap even though kubernetes suggests against it.
- PART swap swap 32G
- PART /boot ext4 1G
- PART / ext4 30G
- PART /root-disk-remaining ext4 all
hetzner_installimage_image: /root/.oldroot/nfs/images/Ubuntu-2004-focal-64-minimal.tar.gz
hetzner_installimage_bootloader: grub
tasks:
#################
# Machine reset #
#################
- name: Ensure required variables are provided
ansible.builtin.fail:
msg: "hetzner_webservice_(username|password) variables are empty/not defined"
no_log: true
when: 'item == ""'
with_items:
- "{{ hetzner_webservice_username }}"
- "{{ hetzner_webservice_password }}"
- name: Generate local SSH key fingerprint
delegate_to: localhost
shell: "ssh-keygen -E md5 -lf ~/.ssh/id_rsa.pub | cut -d' ' -f2 | cut -d':' -f2-"
register: ssh_key_fingerprint
- name: Trigger rescue mode on server
ansible.builtin.uri:
method: POST
url: "https://robot-ws.your-server.de/boot/{{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}/rescue"
user: "{{ lookup('env', 'HETZNER_WEBSERVICE_USERNAME') | default(hetzner_webservice_username) }}"
password: "{{ lookup('env', 'HETZNER_WEBSERVICE_PASSWORD') | default(hetzner_webservice_password) }}"
status_code: [200, 409] # 409 returned when it is already set
body_format: form-urlencoded
body:
os: linux
arch: 64
authorized_key:
- "{{ ssh_key_fingerprint.stdout }}"
###############
# Rescue mode #
###############
- name: Restart server to enter rescue mode
delegate_to: localhost
ansible.builtin.uri:
method: POST
url: "https://robot-ws.your-server.de/reset/{{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}"
user: "{{ hetzner_webservice_username }}"
password: "{{ hetzner_webservice_password }}"
status_code: [200, 503] # 503 is returned when rescue mode is already enabled
body_format: form-urlencoded
body:
type: hw # "power button"
- name: Wait for SSH to come back up (enter rescue mode) -- this will take a while
ansible.builtin.wait_for_connection:
sleep: 10 # wait 10 seconds between checks
delay: 30 # wait 30 seconds by default
timeout: 600 # 10 minutes max wait
- name: Get MOTD from rescue system shell
shell: cat /etc/motd
register: motd_output
- name: Ensure MOTD is expected value for Hetzner
fail:
msg: MOTD not similar to Hetzner rescue mode MOTD
when: motd_output.stdout.find('Welcome to the Hetzner Rescue System.') == -1
###################
# Drive discovery #
###################
- name: Automatically determine first disk device
shell: |
lsblk | grep disk | awk '{split($0, a, " "); print a[1]}' | sort | head -1 | tail -1
register: first_disk_device
- name: Print first disk
debug:
var: first_disk_device.stdout
- name: Automatically determine second disk device
shell: |
lsblk | grep disk | awk '{split($0, a, " "); print a[1]}' | sort | head -2 | tail -1
register: second_disk_device
- name: Print second disk
debug:
var: second_disk_device.stdout
when: second_disk_device.stdout != ""
- name: Set hetzner_installimage_drives for single disk ({{ first_disk_device.stdout }})
set_fact:
hetzner_installimage_drives:
- DRIVE1 /dev/{{ first_disk_device.stdout }}
- name: Set hetzner_installimage_drives for multiple disks
set_fact:
hetzner_installimage_drives:
- DRIVE1 /dev/{{ first_disk_device.stdout }}
- DRIVE2 /dev/{{ second_disk_device.stdout }}
when: second_disk_device.stdout != ""
########################
# Hetzner installimage #
########################
- name: Copy current authorized keys into file for installimage
shell: |
/usr/bin/tail -1 /root/.ssh/authorized_keys > /root/ssh-authorized-keys
- name: Create installimage utility configuration file
template:
src: installimage.j2
dest: /autosetup
owner: root
group: root
mode: 0644
- name: Run Hetzner installimage
command: |
/root/.oldroot/nfs/install/installimage -g -K /root/ssh-authorized-keys
register: installimage_result
- name: Remove server entries from known_hosts file ({{ hostname }})
tags: [ "known_hosts:delete" ]
delegate_to: localhost
ansible.builtin.known_hosts:
path: "{{ known_hosts_path }}"
name: "{{ hostname }}"
state: absent
vars:
hostname: "{{ inventory_hostname }}"
- name: Remove server entries from known_hosts file ({{ ip }})
tags: [ "known_hosts:delete" ]
delegate_to: localhost
ansible.builtin.known_hosts:
path: "{{ known_hosts_path }}"
name: "{{ ip }}"
state: absent
vars:
ip: "{{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}"
- name: Reboot the machine to get out of rescue mode
ansible.builtin.reboot:
- name: Wait for SSH to the new server to be up
connection: local
wait_for:
host: '{{ inventory_hostname }}'
search_regex: OpenSSH
delay: 10
port: 22
- name: Add server entries from known_hosts file ({{ hostname }})
tags: [ "known_hosts:add" ]
delegate_to: localhost
ansible.builtin.known_hosts:
path: "{{ known_hosts_path }}"
name: "{{ hostname }}"
state: present
key: "{{ lookup('pipe', 'ssh-keyscan {{ hostname }}') }}"
vars:
hostname: "{{ inventory_hostname }}"
- name: Add server entries from known_hosts file ({{ ip }})
tags: [ "known_hosts:add" ]
delegate_to: localhost
ansible.builtin.known_hosts:
path: "{{ known_hosts_path }}"
name: "{{ ip }}"
state: present
key: "{{ lookup('pipe', 'ssh-keyscan {{ ip }}') }}"
vars:
ip: "{{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}"
In a futile effort to keep this short I’m going to be listing only the names of tasks and including the source code where appropriate. I have left in anything that was interesting though, like the SSH host key handling at the top and the installimage parameters (huge thanks to andrelohmann/ansible-role-hetzner_installimage). The playbook takes a bit of time to run, but it’s repeatable and automated, which is great.
Since I’ve decided to release this post in parts and the repo isn’t ready yet, I’ll just post the contents of the files in their entirety and take out the GitLab links.
In addition to just getting a fresh install of Ubuntu 20.04 on the machine, I wanted/needed to do some basic setup and a tiny bit of server hardening. Here’s what those playbooks look like:
pre-ansible-setup.yaml
#
# Play for executing tasks to set up ansible on a server that doesn't
# necessarily have Python/ansible requirements installed
#
---
- name: pre-ansible setup
hosts: "{{ ansible_limit | default(omit) }}"
remote_user: root
gather_facts: no # would fail since python may not necessarily be installed
vars:
username: ubuntu
public_key: ~/.ssh/id_rsa.pub
tasks:
- name: "Set hostname ({{ generated_hostname }})"
ansible.builtin.hostname:
name: "{{ generated_hostname }}"
vars:
generated_hostname: "{{ inventory_hostname.split('.') | first }}"
#####################
# Passwordless Sudo #
#####################
- name: check for passwordless sudo
raw: "timeout 1s sudo echo 'check'"
register: passwordless_sudo_check
ignore_errors: yes
no_log: true
- name: create admin group
when: passwordless_sudo_check["rc"] != 0
raw: |
echo {{ ssh_initial_password }} | sudo -Ss &&
sudo groupadd admins --system || true
- name: add user to admin group
when: passwordless_sudo_check["rc"] != 0
raw: |
echo {{ ssh_initial_password }} | sudo -Ss &&
sudo usermod -a -G admins {{ ssh_user }}
- name: copy sudoers file, make temporary editable
when: passwordless_sudo_check["rc"] != 0
raw: |
echo {{ ssh_initial_password }} | sudo -Ss &&
sudo cp /etc/sudoers /etc/sudoers.bak &&
sudo cp /etc/sudoers /etc/sudoers.tmp &&
sudo chmod 777 /etc/sudoers.tmp
- name: add admins no passwd rule for sudoers file
when: passwordless_sudo_check["rc"] != 0
raw: |
echo {{ ssh_initial_password }} | sudo -Ss &&
sudo echo -e "\n%admins ALL=(ALL:ALL) NOPASSWD:ALL" >> /etc/sudoers.tmp &&
sudo chmod 440 /etc/sudoers.tmp
- name: check and install new sudoers
when: passwordless_sudo_check["rc"] != 0
raw: |
echo {{ ssh_initial_password }} | sudo -Ss &&
sudo visudo -q -c -f /etc/sudoers.tmp &&
sudo cp -f /etc/sudoers.tmp /etc/sudoers
###################
# Ansible install #
###################
- name: check for installed ansible (apt)
register: ansible_check
ignore_errors: yes
no_log: true
shell: |
dpkg -s ansible
# see: https://stackoverflow.com/questions/33563425/ansible-1-9-4-failed-to-lock-apt-for-exclusive-operation
- name: (apt) Ensure apt list dir exists
when: ansible_check["rc"] != 0
file:
path: /var/lib/apt/lists/
state: directory
mode: 0755
- name: (apt) Install software-properties-common
when: ansible_check["rc"] != 0
ansible.builtin.apt:
update_cache: yes
name:
- software-properties-common
- name: (apt) Enable universe repository
become: yes
when: ansible_check["rc"] != 0
ansible.builtin.command: add-apt-repository universe
- name: apt-get install software-properties-common
when: ansible_check["rc"] != 0
ansible.builtin.apt:
name:
- software-properties-common
- name: add apt repo for ansible
when: ansible_check["rc"] != 0
shell: |
apt-add-repository -y ppa:ansible/ansible
- name: apt-get update and install ansible
when: ansible_check["rc"] != 0
ansible.builtin.apt:
update_cache: yes
name:
- ansible
post-ansible-setup.yaml
#
# Play for post-ansible setup tasks -- basic packages, firewall (ufw),
# and unattended security upgrades
#
---
- name: post-ansible setup
hosts: "{{ ansible_limit | default(omit) }}"
remote_user: root
gather_facts: yes
tasks:
- name: Install generally useful packages
become: yes
ansible.builtin.apt:
name: "{{ packages }}"
update_cache: yes
state: present
vars:
packages:
- make
- libseccomp2
- apt-transport-https
- ufw
- name: Enable UFW, default reject
tags: [ "ufw" ]
become: yes
community.general.ufw:
state: enabled
policy: reject
- name: (ufw) Allow SSH access
tags: [ "ufw" ]
become: yes
community.general.ufw:
rule: allow
name: OpenSSH
- name: (ufw) Limit SSH
tags: [ "ufw" ]
become: yes
community.general.ufw:
rule: limit
port: ssh
proto: tcp
###################
# Default Tooling #
###################
- name: Install performance measurement tooling
ansible.builtin.apt:
name: "{{ packages }}"
update_cache: yes
state: present
vars:
packages:
- iotop
- htop
- sysstat # contains iostat and others
- unattended-upgrades
- name: Install some creature comforts
ansible.builtin.apt:
name: "{{ packages }}"
update_cache: yes
state: present
vars:
packages:
- tree
################################
# Unattended security upgrades #
################################
- name: Install unattended-upgrades
ansible.builtin.apt:
name: "{{ packages }}"
update_cache: yes
state: present
vars:
packages:
- unattended-upgrades
- name: Ensure unattended upgrades service is running
ansible.builtin.systemd:
name: unattended-upgrades
state: started
- name: Install unattended upgrade config w/ blacklist and email notifications
template:
src: 50-unattended-upgrades.conf.j2
dest: /etc/apt/apt.conf.d/50-unattended-upgrades.conf
owner: root
group: root
mode: 0644
- name: Install automatic upgrades
template:
src: 20-auto-upgrades.conf.j2
dest: /etc/apt/apt.conf.d/20-auto-upgrades.conf
owner: root
group: root
mode: 0644
- name: Hold all grub-related packages
shell: |
apt-mark hold grub*
apt-mark hold grub
apt-mark hold grub-common
apt-mark hold grub2
apt-mark hold grub2-common
apt-mark hold grub-pc
apt-mark hold grub-pc-bin
The bit at the end about holding back grub-related packages is pretty important – if we don’t do this we will brick the machine every time we restart; I’ve posted about this before (3 years ago, wow time flies).
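If you’re paranoid (I am), a small follow-up task – my own addition, not part of the playbook above – can confirm the holds actually took:

- name: Verify grub packages are held
  ansible.builtin.command: apt-mark showhold
  register: held_packages
  changed_when: false
- name: Fail if grub-pc is not in the held package list
  ansible.builtin.fail:
    msg: "grub-pc is not held -- an unattended upgrade could brick the machine on reboot"
  when: "'grub-pc' not in held_packages.stdout"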
$STORAGE_PLUGIN-specific system requirements

Since different packages have different requirements, there are some variations on what we need to install to run each. Since we’ve opted to use Ubuntu 20.04, the content in this section will be somewhat specific to it, and I’ll include package names here. The way I’ve gone about this is by defining a list of “supported” storage plugins:
---
- name: Storage plugin setup
hosts: "{{ ansible_limit | default(omit) }}"
remote_user: root
vars:
supported_storage_plugins:
- all
- rook-ceph-lvm
- rook-ceph-zfs
- linstor-rbd9
- openebs-mayastor
- openebs-cstor
- openebs-jiva
- openebs-localpv-hostpath
- openebs-localpv-device
- openebs-localpv-zfs
tasks:
# ... elided ... #
Then relevant tasks will scope themselves down to the ones they care about:
- name: Do a thing that is useful for only OpenEBS cStor and Jiva
when: storage_plugin == "all" or storage_plugin in target_plugins
thing:
args: that is being done
vars:
target_plugins:
- openebs-cstor
- openebs-jiva
Note that most of the dependencies will be ansible.builtin.apt tasks, which is the point of using a distribution with nice package management and a wide ecosystem.
Here are some of those template files:
50-unattended-upgrades.conf.j2:
Unattended-Upgrade::Allowed-Origins {
"${distro_id}:${distro_codename}";
"${distro_id}:${distro_codename}-security";
"${distro_id}ESM:${distro_codename}";
}
Unattended-Upgrade::Package-Blacklist {
"grub";
}
{% if unattended_upgrade_email is defined %}
Unattended-Upgrade::Mail "{{ unattended_upgrade_email }}";
Unattended-Upgrade::MailOnlyOnError "true";
{% endif %}
20-auto-upgrades.conf.j2
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::AutocleanInterval "7";
installimage.j2 (again, mostly stolen from andrelohmann/ansible-role-hetzner_installimage):
{% for drive in hetzner_installimage_drives %}
{{ drive }}
{% endfor %}
{% for raid in hetzner_installimage_raid %}
{{ raid }}
{% endfor %}
BOOTLOADER {{ hetzner_installimage_bootloader }}
HOSTNAME {{ hetzner_installimage_hostname }}
{% for partition in hetzner_installimage_partitions %}
{{ partition }}
{% endfor %}
IMAGE {{ hetzner_installimage_image }}
STORAGE_PLUGIN=rook-ceph-lvm
The dependency list for LVM-powered Ceph via Rook is pretty short:

- lvm2
- the rbd kernel module

Here’s that expressed as an ansible task:
- name: Install LVM
when: storage_plugin == "all" or storage_plugin in target_plugins
block:
- name: Install lvm2
ansible.builtin.apt:
name: lvm2
update_cache: yes
state: present
- name: Ensure rbd kernel module is installed
community.general.modprobe:
name: rbd
state: present
vars:
target_plugins:
- rook-ceph-lvm
- linstor-rbd9
STORAGE_PLUGIN=rook-ceph-zfs
The dependency list for ZFS-powered Ceph via Rook:
- zfsutils-linux (Ubuntu 20.04 seems to only need this, no zfs or zfs-linux package)

Expressed as an ansible task:
- name: Install ZFS
when: storage_plugin == "all" or storage_plugin in target_plugins
ansible.builtin.apt:
name:
- zfs-linux
- zfsutils-linux
update_cache: yes
state: present
vars:
target_plugins:
- rook-ceph-zfs
- openebs-localpv-zfs
- linstor-rbd9
STORAGE_PLUGIN=openebs-mayastor
The dependency list for OpenEBS MayaStor (as reflected in the tasks below):

- huge page support (512 2MB huge pages)
- the nvme-tcp kernel module (for NVMe-oF over TCP)
- iscsi (via open-iscsi on Ubuntu)

Expressed as ansible tasks:
- name: Enable huge page support
when: storage_plugin == "all" or storage_plugin in target_plugins
block:
- name: set nr_hugepages
ansible.builtin.shell: |
echo 512 | tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
- name: set nr_hugepages via sysctl
ansible.builtin.shell: |
echo vm.nr_hugepages = 512 | tee -a /etc/sysctl.conf
vars:
target_plugins:
- openebs-mayastor
- name: Ensure nvme-oF over TCP support
when: storage_plugin == "all" or storage_plugin in target_plugins
community.general.modprobe:
name: nvme-tcp
state: present
vars:
target_plugins:
- openebs-mayastor
- name: Install iscsi
when: storage_plugin == "all" or storage_plugin in target_plugins
block:
- name: Install apt package
ansible.builtin.apt:
name: open-iscsi
update_cache: yes
state: present
- name: Enable iscsid
ansible.builtin.systemd:
name: iscsid
state: started
enabled: yes
vars:
target_plugins:
- openebs-mayastor
- openebs-cstor
- openebs-jiva
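One small note on the huge page step above: echoing into /etc/sysctl.conf works, but appends a new line on every run. A more idempotent sketch (assuming the ansible.posix collection is available) could use the sysctl module instead:

- name: Set vm.nr_hugepages persistently and immediately
  when: storage_plugin == "all" or storage_plugin in target_plugins
  ansible.posix.sysctl:
    name: vm.nr_hugepages
    value: "512"
    sysctl_set: yes   # also apply the value right away
    state: present
    reload: yes
  vars:
    target_plugins:
      - openebs-mayastor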
STORAGE_PLUGIN=openebs-cstor
The dependency list for OpenEBS cStor:
- lvm2 (if you run on LVM)
- iscsi (via open-iscsi on Ubuntu)

iscsi has already been covered so I’ll leave it out here.
STORAGE_PLUGIN=openebs-jiva
The dependency list for OpenEBS Jiva (rancher/longhorn under the covers):

- lvm2 (if you run on LVM)
- iscsi (via open-iscsi on Ubuntu)

iscsi has already been covered so I’ll leave it out here as well.
STORAGE_PLUGIN=openebs-localpv-hostpath
Nothing required for hostpath that isn’t already included in Ubuntu 20.04 I think – which is what one might expect given how simple it is (in theory).
STORAGE_PLUGIN=openebs-localpv-device
Same story as openebs-localpv-hostpath – the requirements should all be present already.
STORAGE_PLUGIN=openebs-localpv-zfs
The dependency list for OpenEBS LocalPVs via ZFS:
- zfsutils-linux (Ubuntu 20.04 seems to only need this, no zfs or zfs-linux package)

No huge surprise here – you’re going to need ZFS if you want to run hostpaths built on it.
STORAGE_PLUGIN=linstor-rbd9
The dependency list for running LINSTOR:
- zfsutils-linux (LINSTOR can run on ZFS)
- lvm2
- the ppa:linbit/linbit-drbd9-stack apt repository
- drbd-dkms (in ppa:linbit/linbit-drbd9-stack)
- drbd-utils (in ppa:linbit/linbit-drbd9-stack)
- linstor-controller (in ppa:linbit/linbit-drbd9-stack)
- linstor-satellite (in ppa:linbit/linbit-drbd9-stack)
- linstor-client (in ppa:linbit/linbit-drbd9-stack)

The ansible you haven’t seen yet:
- name: Add drbd9 apt repositories
when: storage_plugin == "all" or storage_plugin in target_plugins
ansible.builtin.apt_repository:
repo: ppa:linbit/linbit-drbd9-stack
state: present
vars:
target_plugins:
- linstor-rbd9
- name: Install LINSTOR components
when: storage_plugin == "all" or storage_plugin in target_plugins
block:
- name: Install drbd packages
ansible.builtin.apt:
name:
- drbd-dkms
- drbd-utils
update_cache: yes
state: present
- name: Install linstor components
ansible.builtin.apt:
name:
- linstor-controller
- linstor-satellite
- linstor-client
update_cache: yes
state: present
- name: Ensure rbd kernel module is installed
community.general.modprobe:
name: rbd
state: present
vars:
target_plugins:
- linstor-rbd9
I keep it simple and just install all the pieces of LINSTOR on every machine, for easy flexibility now and in the future.
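One thing the apt install alone may not guarantee is that the LINSTOR services are actually running on the host; a hedged sketch of making sure the satellite (and, on this single node, the controller) are up – the service names here are the ones I believe the Ubuntu packages ship, so verify on your own install:

- name: Ensure LINSTOR services are started and enabled
  when: storage_plugin == "all" or storage_plugin in target_plugins
  ansible.builtin.systemd:
    name: "{{ item }}"
    state: started
    enabled: yes
  with_items:
    - linstor-controller
    - linstor-satellite
  vars:
    target_plugins:
      - linstor-rbd9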
Installing k0s

Obviously, before we get started using kubectl we’re going to need to at least install Kubernetes! Now that the machine-level dependencies for the storage plugins are done, let’s actually install k8s, with k0s.
The Makefile target k8s-install is pretty similar to the rest of the targets:
k8s-install: require-target
	@echo "[info] performing k8s install..."
	$(ANSIBLE_PLAYBOOK) k8s-install.yml \
		-i $(INVENTORY_PATH) \
		--limit=$(TARGET) \
		$(ANSIBLE_ARGS)
And the ansible code:
#
# Playbook for setting up kubernetes
#
---
- name: k8s-setup all-in-one server setup
hosts: "{{ ansible_limit | default(omit) }}"
remote_user: root
vars:
k0s_version: v0.12.0
k0s_checksum: sha256:0a3ead8f8e5f950390eeb76bd39611d1754b282536e8d9dbbaa0676550c2edbf
tasks:
- name: Populate service facts
ansible.builtin.service_facts:
- name: Download k0s
ansible.builtin.get_url:
url: |
https://github.com/k0sproject/k0s/releases/download/{{ k0s_version }}/k0s-{{ k0s_version }}-amd64
checksum: "{{ k0s_checksum }}"
mode: 0755
dest: /usr/bin/k0s
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Create /var/lib/k0s folder
ansible.builtin.file:
path: /var/lib/k0s
state: directory
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Add k0s config file
ansible.builtin.template:
src: k0s-config.yaml.j2
dest: /var/lib/k0s/config.yaml
owner: root
group: root
mode: 0644
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Install k0s
ansible.builtin.command: |
k0s install controller -c /var/lib/k0s/config.yaml --single
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Start the k0s service
ansible.builtin.systemd:
name: k0scontroller
state: started
enabled: yes
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Create worker join token (saved @ /tmp/worker-token)
shell: |
k0s token create --role=worker --expiry=168h > /tmp/worker-token
when: ansible_facts.services["k0scontroller.service"] is not defined
- name: Copy out worker token
ansible.builtin.fetch:
src: /tmp/worker-token
dest: output
- name: Copy out cluster configuration
ansible.builtin.fetch:
src: /var/lib/k0s/pki/admin.conf
dest: output
- name: Replace localhost in cluster configuration
delegate_to: localhost
ansible.builtin.replace:
path: "output/{{ inventory_hostname }}/var/lib/k0s/pki/admin.conf"
regexp: 'https://localhost:6443'
replace: "https://{{ cluster_external_address | default(inventory_hostname) }}:6443"
- name: (ufw) Allow TCP access on port 6443
tags: [ "ufw" ]
become: yes
community.general.ufw:
rule: allow
port: '6443'
proto: tcp
- name: (ufw) Allow UDP access on port 6443
tags: [ "ufw" ]
become: yes
community.general.ufw:
rule: allow
port: '6443'
proto: udp
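Not shown in the playbook, but a sanity check I’d consider adding at the end is waiting for the node to actually go Ready (k0s bundles kubectl as a subcommand in recent versions):

- name: Wait for the single node to become Ready
  ansible.builtin.command: |
    k0s kubectl wait --for=condition=Ready node --all --timeout=60s
  register: node_ready
  retries: 10
  delay: 30
  until: node_ready.rc == 0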
Easy peasy – getting a Kubernetes cluster functioning has never been so easy (well, technically k3s also made it similarly easy first)! I might actually stop messing with kubeadm and orchestrating/running it myself if it’s going to be this easy.
One thing I did have to do was spend some time configuring k0s, so here’s what the template used above looks like:
---
apiVersion: k0s.k0sproject.io/v1beta1
kind: Cluster
metadata:
name: k0s
spec:
api:
externalAddress: {{ cluster_external_address | default(inventory_hostname) }}
address: {{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}
sans:
- {{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}
- {{ cluster_external_address | default(inventory_hostname) }}
- {{ inventory_hostname }}
storage:
type: etcd
etcd:
peerAddress: {{ hostvars[inventory_hostname]['ansible_env'].SSH_CONNECTION.split(' ')[2] }}
network:
podCIDR: 10.244.0.0/16
serviceCIDR: 10.96.0.0/12
provider: calico
calico:
mode: vxlan
vxlanPort: 4789
vxlanVNI: 4096
mtu: 1450
wireguard: false
flexVolumeDriverPath: /usr/libexec/k0s/kubelet-plugins/volume/exec/nodeagent~uds
withWindowsNodes: false
overlay: Always
podSecurityPolicy:
defaultPolicy: 00-k0s-privileged
telemetry:
interval: 10m0s
enabled: true
installConfig:
users:
etcdUser: etcd
kineUser: kube-apiserver
konnectivityUser: konnectivity-server
kubeAPIserverUser: kube-apiserver
kubeSchedulerUser: kube-scheduler
images:
default_pull_policy: IfNotPresent
konnectivity:
image: us.gcr.io/k8s-artifacts-prod/kas-network-proxy/proxy-agent
version: v0.0.13
metricsserver:
image: gcr.io/k8s-staging-metrics-server/metrics-server
version: v0.3.7
kubeproxy:
image: k8s.gcr.io/kube-proxy
version: v1.20.5
coredns:
image: docker.io/coredns/coredns
version: 1.7.0
calico:
cni:
image: calico/cni
version: v3.16.2
flexvolume:
image: calico/pod2daemon-flexvol
version: v3.16.2
node:
image: calico/node
version: v3.16.2
kubecontrollers:
image: calico/kube-controllers
version: v3.16.2
Generally, if you’re not using the CRD/file-based setup for tools like kubeadm or k0s, you’re missing out – it’s a nice way to use more declarative configuration, and YAML at this size/complexity (just a single configuration file for a single tool) is quite nice.
OK, now we’ve got a nice repeatable process for getting a pre-provisioned (purchased) Hetzner dedicated server to the point of running a single-node Kubernetes cluster with k0s. Fully automating as we go takes much more time, but it will pay off in spades once we’re running tests (and in general for me in the future).
I can’t believe I thought this would all fit in one post originally – I’ve since split it up into 5 parts, and this is the end of Part 1. Hopefully parts 2-5 won’t take too much longer (I do have some other work I wanted to do) – stay tuned to this space (or however you found this article) for the rest!