tl;dr - Setting up Piwik is pretty straightforward, since I’ve gone through the trouble of setting up a database before, and Piwik’s web-based setup is pretty convenient. This post is the last in the pipeline that’s related to Kubernetes for a bit.
One of the most useful tools I’ve ever come across is Piwik – it’s an excellent self-hostable tool for doing web analytics like tracking visits to your website (this very site uses it as well). One of the last moves in porting all my infrastructure to Kubernetes was of course to port my Piwik data and figure out how I should be running Piwik on the new Kubernetes cluster. Piwik is a pretty robust piece of software, so I was worried that there would be a lot of moving pieces that I needed to port over.
Turns out it’s not that hard – Here’s what I did:
- Set up a mysql/mariadb container and transfer over old data
Piwik has an extensive set of user guides which are pretty fantastic. Also check out the Piwik installation documentation FAQ; it’s pretty useful. Since I had an existing Piwik instance from which I was going to be transferring data, I also needed to read up on how to move Piwik from one server to another.
I actually encountered quite a bit of frustration trying to use the official mysql docker image, and instead went with the official mariadb image. Here’s what I found while trying to work with the mysql container that made me switch:
Outside of these bulletpoints I didn’t write much in the notes about what frustrated me, but I’d really suggest you use mariadb instead.
One of the great things about Kubernetes (and Docker in general) is that you can actually test this process WITHOUT too much hassle: just make temporary mariadb-based pods and fire away (and feel no guilt when the pod is torn down and none of the data is saved). At this point, I did a lot of experimentation and didn’t worry too much about getting a proper resource configuration written just yet.
NOTE You’re going to want to do the data transfer BEFORE you set up Piwik. Once the deployment is up (you can find the resource configuration later in this article), transfer the data first, and only then kubectl port-forward and finish the web-based setup.
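For orientation, the transfer steps listed just below might be scripted roughly like this. Treat it as a sketch under assumptions, not necessarily the exact commands used: the function name restore_piwik_dump and the /tmp/piwik-restore.sql path are made up for illustration, it relies on the piwik-mariadb container name and the MYSQL_ROOT_PASSWORD env variable from the resource configuration later in this article, and it pipes the script into the mysql client instead of using source from the mariadb console.

```shell
# Hypothetical helper: unzip a Piwik SQL dump, copy it into the mariadb
# container of the given pod, and load it with the mysql client.
restore_piwik_dump() {
  local pod="$1"    # e.g. piwik-<gibberish>, from `kubectl get pods`
  local dump="$2"   # e.g. the piwik_(DATE).sql.gz file from the old server
  local sql

  # The dump is just a gzipped SQL script, so unzip it to a temp file first.
  sql="$(mktemp)" || return 1
  gunzip -c "$dump" > "$sql" || return 1

  # Get the unzipped script into the mariadb container.
  kubectl cp "$sql" "$pod":/tmp/piwik-restore.sql -c piwik-mariadb || return 1

  # Load it inside the container. The single quotes make $MYSQL_ROOT_PASSWORD
  # expand in the container, not locally. If the dump covers a single database
  # only, the database name would need to be appended to the mysql command.
  kubectl exec "$pod" -c piwik-mariadb -- \
    sh -c 'mysql -uroot -p"$MYSQL_ROOT_PASSWORD" < /tmp/piwik-restore.sql'
}
```

Usage would be something like restore_piwik_dump piwik-<gibberish> ./piwik_(DATE).sql.gz, with the real pod name and dump file name filled in.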
General gist of how I transferred data from the old Piwik instance (basically rehashing the Piwik FAQ):
1. Unzip the piwik_(DATE).sql.gz file you generated from the other database (it’s just a gzipped SQL script)
2. Copy the unzipped file over (get it into the mariadb container somehow)
3. Run mysql -p, put in the password that was specified in the container’s env (in your Kubernetes resource config) and you should get the mariadb console
4. Run source <absolute path to backup file> - this command will create the piwik table(s), updating things along the way
After I was done experimenting with ephemeral mariadb/piwik containers, the finalized Kubernetes resource configuration looks like this:
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: piwik
  annotations:
    ingress.kubernetes.io/class: "nginx"
    ingress.kubernetes.io/ssl-redirect: "true"
    ingress.kubernetes.io/limit-rps: "20"
spec:
  tls:
  - hosts:
    - "piwik.example.com"
    secretName: letsencrypt-certs-all
  rules:
  - host: "piwik.example.com"
    http:
      paths:
      - path: "/.well-known/acme-challenge"
        backend:
          serviceName: letsencrypt-helper-svc
          servicePort: 80
      - path: "/"
        backend:
          serviceName: piwik
          servicePort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: piwik
  labels:
    app: piwik
spec:
  type: LoadBalancer
  selector:
    app: piwik
  ports:
  - name: mysql
    protocol: TCP
    port: 3306
  - name: piwik
    protocol: TCP
    port: 80
---
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: piwik-archive
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: piwik-cron
            image: piwik:apache
            args:
            - /bin/bash
            - -c
            - date; /usr/local/bin/php /var/www/html/console core:archive www-data
          restartPolicy: OnFailure
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: piwik
  labels:
    app: piwik
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: piwik
        env: prod
    spec:
      containers:
      - name: piwik-mariadb
        image: mariadb:10.3.0
        imagePullPolicy: IfNotPresent
        args: ["--character-set-server=utf8mb4", "--collation-server=utf8mb4_unicode_ci"]
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: youshoulddefinitelychangethis
        volumeMounts:
        - name: piwik-mysql-data
          mountPath: /var/lib/mysql
        - name: piwik-mysql-config
          mountPath: /etc/mysql/conf.d
      - name: piwik
        image: piwik:3.1.1-apache
        imagePullPolicy: Always
        ports:
        - containerPort: 80
        volumeMounts:
        - name: piwik-config
          mountPath: /var/www/html/config
      volumes:
      - name: piwik-config
        hostPath:
          path: /var/data/piwik/config
      - name: piwik-mysql-data
        hostPath:
          path: /var/data/piwik/mysql/data
      - name: piwik-mysql-config
        hostPath:
          path: /var/data/piwik/mysql/config.d
NOTE For the initial setup, I left out the Cron Job resource.
You’ll likely have to change bits like where you store the data and/or the mysql root password, but that configuration kubectl apply’d correctly for me. Something I had forgotten while going through it, and which is useful to remember: containers in the same pod can reach each other over 127.0.0.1/localhost, since they share a network namespace. I think in the past I’ve made services for the sole reason of exposing a container in a pod to another container, and that definitely isn’t necessary.
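One way to see the shared localhost in action is to open a TCP connection to the database from inside the piwik container. The sketch below assumes the container names from the resource configuration above and that php is on the PATH inside the piwik image; the function name check_db_from_piwik is made up for illustration.

```shell
# Hypothetical helper: from inside the piwik container, try to open a TCP
# connection to the mariadb container over 127.0.0.1:3306.
# Exits 0 if the connection opens, non-zero otherwise.
check_db_from_piwik() {
  local pod="$1"   # e.g. piwik-<gibberish>, from `kubectl get pods`
  kubectl exec "$pod" -c piwik -- \
    php -r 'exit(@fsockopen("127.0.0.1", 3306) ? 0 : 1);'
}
```

If that succeeds, Piwik can use 127.0.0.1 as its database host without any Service in between.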
If this resource configuration works for you as it did for me, the next step is to get a feel for setting up Piwik through the web interface by port-forwarding to it with a command like kubectl port-forward <pod name> 5000:80.
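If you don’t feel like copying the pod name by hand, the lookup can be folded into a small helper. This is only a sketch: the function name piwik_port_forward is made up, and it assumes the app=piwik label from the Deployment above and that the first matching pod is the one you want.

```shell
# Hypothetical helper: find the first pod labelled app=piwik and forward a
# local port (default 5000) to port 80 of that pod.
piwik_port_forward() {
  local local_port="${1:-5000}"
  local pod
  pod="$(kubectl get pods -l app=piwik -o jsonpath='{.items[0].metadata.name}')" || return 1
  kubectl port-forward "$pod" "${local_port}:80"
}
```

With the forward running, the web-based setup should be reachable at http://localhost:5000 (or whichever local port you passed in).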
Finishing the web-based setup for Piwik is pretty easy – if you’re going to be restoring data from a previous installation, make sure to set the table prefix to what you used before, along with the name of the database itself.
Another important thing to note if you’re doing a transfer is that you need to set the mariadb/mysql root password to the PREVIOUS instance’s mariadb/mysql root password. As soon as the previous database’s dump is restored, the root password changes, so after the next restart it will no longer be what you put in the Kubernetes resource config.
After clicking through the web-based setup, Piwik will either recognize that the tables it’s trying to create are already present (if you’re doing a restoration/transfer), or it will make the necessary tables.
To test, visit your Piwik instance through external ingress or kubectl port-forward, and attempt to log in and view everything.
While developing the resource configuration above, I initially started by using the version of piwik that was meant to work with NGINX and PHP-FPM.
What looked to me like a big limitation of Kubernetes is that you can only mount folders, NOT individual files (volume mounts do have a subPath field that can target a single file, so this is not a hard limit). This meant that I spent a lot of time trying to figure out how to cleanly inject configuration into the default NGINX container. It took me more than 30 minutes and I couldn’t come up with anything that was clean (and definitely didn’t want to build my own container just for that), so I just went with the apache flavor of Piwik instead, the code for which you can find on GitHub.
If you’re unfamiliar with Apache check it out – it’s an older, very solid alternative to NGINX.
NOTE The PHP-FPM version of Piwik listens on port 9000 and expects FCGI connections from a reverse proxy; it doesn’t serve HTTP directly. I basically abandoned this because I didn’t want to add another nginx container with custom configuration to do the appropriate proxying.
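If the single-file mount is the only blocker, a volume mount’s subPath field is one possible way around it. The sketch below is not something this setup actually used: the ConfigMap and pod names are made up, the config content is a placeholder, and the function nginx_single_file_manifest just prints a manifest that mounts one ConfigMap key over the stock nginx image’s /etc/nginx/conf.d/default.conf.

```shell
# Hypothetical helper: print a ConfigMap plus a pod that mounts ONE key of it
# as a single file via subPath. All resource names here are made up.
nginx_single_file_manifest() {
  cat <<'EOF'
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: piwik-nginx-conf
data:
  default.conf: |
    # placeholder: the server block proxying FCGI to 127.0.0.1:9000 goes here
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-subpath-demo
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: nginx-conf
      mountPath: /etc/nginx/conf.d/default.conf
      subPath: default.conf   # mount just this key, not the whole folder
  volumes:
  - name: nginx-conf
    configMap:
      name: piwik-nginx-conf
EOF
}
```

Piping the output into kubectl apply -f - would create both resources; the rest of /etc/nginx/conf.d in the container is left as the image shipped it.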
Piwik’s documentation around configuring it BEFORE it starts up (by way of some configuration files) was really hard to find (if it exists, I couldn’t find the definitive source). I had limited success dropping a config.ini.php file in the config folder, but Piwik would tell me that it couldn’t read the file for some reason, even when it was a carbon copy of the global.ini.php file that is created by setup afterwards.
I figured I just didn’t know enough about Piwik, and stuck to the web-based configuration, which is worse for reproducibility, but Piwik is very much a pet piece of infrastructure so I didn’t kick myself over it for too long.
The root password changing after a restore is mentioned earlier in the guide but is worth a re-mention, since it really surprised me while I was going through it.
If you DID forget, and Piwik is giving you errors that it can’t connect to the DB (likely after you re-deploy it once):
1. Remember what the root password used to be on the old server
2. Get a shell in the piwik container (kubectl exec -it piwik-<gibberish> -c piwik -- /bin/bash)
3. Update the database password in /var/www/html/config/config.ini.php
Of course, to edit that file you’ll have to do a little bit of work, likely at the very least installing Vim or some other editor (apt-get update && apt-get install vim worked for me) – the container didn’t have nano, vi, or vim installed.
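An alternative to installing an editor is a one-off sed edit. The sketch below rests on a few assumptions: that config.ini.php keeps the password as a password = "..." line inside a [database] section, that GNU sed is available, and that the password contains no double quotes. The function name set_piwik_db_password is made up for illustration.

```shell
# Hypothetical helper: rewrite the password line inside the [database]
# section of a Piwik config.ini.php file, in place (GNU sed assumed).
set_piwik_db_password() {
  local file="$1" password="$2" escaped
  # Escape the characters sed treats specially in a replacement: \ & and |.
  escaped="$(printf '%s' "$password" | sed 's/[\&|]/\\&/g')"
  # Only touch lines between [database] and the next section header.
  sed -i "/^\[database\]/,/^\[/ s|^password *=.*|password = \"${escaped}\"|" "$file"
}
```

It can be run wherever the file is reachable: inside the container, or on the node itself, since the Deployment above mounts /var/www/html/config from the hostPath /var/data/piwik/config.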
After some thinking it wasn’t hard to come up with what the resource configuration SHOULD look like for the cron job that needs to run with Piwik:
apiVersion: batch/v2alpha1
kind: CronJob
metadata:
  name: piwik-archive
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: piwik-cron
            image: piwik:apache
            args:
            - /bin/bash
            - -c
            - date; /usr/local/bin/php /var/www/html/console core:archive www-data
          restartPolicy: OnFailure
The next thing I realized was that I didn’t actually have Cron Jobs enabled on my cluster, so I needed to enable them. This consisted of adding an additional runtime configuration switch (--runtime-config=batch/v2alpha1=true) to the API server manifest (for me that resides at /etc/kubernetes/manifests/api-server.yaml).
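To confirm the switch took effect once the API server has restarted, one quick check is whether batch/v2alpha1 shows up in the output of kubectl api-versions. A minimal sketch (the function name cronjob_api_enabled is made up for illustration):

```shell
# Hypothetical helper: succeed only if the API server advertises the
# batch/v2alpha1 API version that the CronJob manifest above uses.
cronjob_api_enabled() {
  kubectl api-versions | grep -qx 'batch/v2alpha1'
}
```

Something like cronjob_api_enabled && kubectl apply -f <your cron job file> then only applies the Cron Job once the API is actually available.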
I only really need Piwik for tracking hits (for now) to the various sites I operate. I really wish there was a simpler version of Piwik that just did that, and maybe used SQLite to make things even easier to move around. Maybe I’ll make something like that some day.
So all in all it was pretty easy to set up Piwik. I spent almost as much time figuring out how to transfer over the backup (and how Piwik handles it) as actually figuring out what the resource configuration was supposed to look like. That’s a great sign for the usability of a tool like Kubernetes; after paying the startup cost (to learn Kubernetes concepts/technology), I’m being paid back in spades.
The theme on this blog for a good number of posts has been Kubernetes – this is looking like it’s going to come to an end, as this is the last post that was in the pipeline related to Kubernetes for a bit. Hope you’ve enjoyed these posts anywhere near as much as I enjoyed working through and getting used to Kubernetes!