Stale kubeconfig breaking service to service communication with kube-router

tl;dr - If you’re running kube-router (I run version 1.0.1), make sure to update the kubeconfig that is being used by it after credential rotations, otherwise spooky pod->service (but not pod->pod) communication issues could occur.

Recently, while working on some unrelated issues, I discovered that the kubeconfig that kube-router uses can indeed go stale. I ran into some issues with service->service communication and root-caused them after a bunch of head scratching – the fix was to simply copy the newer kubeconfig over to the right directory on the host for kube-router to pick up. This wasn’t a really satisfying fix, but it was certainly enough to get me going again. I probably won’t be using kube-router for much longer (I plan to move to Cilium or Calico), so for now it was good enough.

Debug process

Here’s the rough gist of what I did to figure out what was wrong:

  • Observe that pod->service communication wasn’t working
  • Check if there are any NetworkPolicy objects involved (and blocking the requests)
  • kubectl exec into the container and do nslookup to the service name (in the output you should see a <service>.<namespace>.svc.cluster.local entry for the service in question)
  • kubectl exec and curl the IP of the pod backing the service (an Endpoint of the Service) directly (you can get pod IPs via kubectl get pods -o wide)
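
As a rough sketch of the checks above in script form (Python here, purely for illustration), the function below builds the kubectl commands in the same order. It assumes the client pod and the service live in the same namespace, that the client image ships nslookup and curl, and that the backing pod speaks plain HTTP. The function names are made up for this example.

```python
import shlex
import subprocess


def debug_commands(client_pod, namespace, service, backend_ip, port):
    """Build the kubectl invocations for the checks above, in order.

    All arguments are placeholders supplied by the caller; nothing here
    is specific to kube-router.
    """
    fqdn = f"{service}.{namespace}.svc.cluster.local"
    return [
        # 1. Are there NetworkPolicy objects that could block the request?
        ["kubectl", "get", "networkpolicy", "-n", namespace],
        # 2. Does the service name resolve from inside the client pod?
        ["kubectl", "exec", "-n", namespace, client_pod, "--",
         "nslookup", fqdn],
        # 3. Look up the IPs of the pods backing the service.
        ["kubectl", "get", "pods", "-n", namespace, "-o", "wide"],
        # 4. Can the client pod reach a backing pod directly (pod->pod)?
        ["kubectl", "exec", "-n", namespace, client_pod, "--",
         "curl", "-sS", "-m", "5", f"http://{backend_ip}:{port}/"],
    ]


def run_checks(commands):
    """Print and run each command; failures are shown, not raised."""
    for argv in commands:
        print("$ " + shlex.join(argv))
        subprocess.run(argv, check=False)
```

Calling something like run_checks(debug_commands("client-pod", "default", "my-svc", "10.0.0.5", 8080)) would run the four steps against whatever cluster kubectl currently points at; every value in that call is a placeholder.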

If all of the above goes well, we know at this point that the problem is at least not DNS and that pod->pod communication is working as we expect, so the problem lies elsewhere.

This is the point where I started to suspect that something with the CNI (kube-router) wasn’t working properly, so I took a look at the logs of my kube-router pod and found the answer:

E0107 19:06:25.099257       1 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.NetworkPolicy: Unauthorized
I0107 19:06:25.099907       1 reflector.go:240] Listing and watching *v1.Pod from github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73
E0107 19:06:25.100317       1 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Pod: Unauthorized
I0107 19:06:25.101082       1 reflector.go:240] Listing and watching *v1.Namespace from github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73
E0107 19:06:25.101501       1 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Namespace: Unauthorized
I0107 19:06:26.096717       1 reflector.go:240] Listing and watching *v1.Endpoints from github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73
E0107 19:06:26.097421       1 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Endpoints: Unauthorized

And there it is – one of the most common problems I run into: something trying to talk to the Kubernetes API and not being able to authenticate. Of course, I didn’t believe these messages immediately – “Why in the world would it be Unauthorized? kube-router definitely has credentials” – so I started going back and looking through my configuration for the kube-router DaemonSet (ConfigMaps, etc).
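
When the log is noisy, errors like these are easy to miss. Here is a minimal sketch of a filter that counts the reflector "Failed to list" errors per resource kind. The regular expression is based only on the format of the log lines shown above, and the function name is made up for this example.

```python
import re
from collections import Counter

# Matches reflector errors of the form seen in the log excerpt, e.g.
#   "... Failed to list *v1.Pod: Unauthorized"
FAILED_LIST = re.compile(r"Failed to list \*(?P<kind>[\w.]+): (?P<reason>.+)$")

# Two lines copied from the log excerpt, used as sample input.
SAMPLE_LOG = """\
E0107 19:06:25.100317       1 reflector.go:205] github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Pod: Unauthorized
I0107 19:06:25.101082       1 reflector.go:240] Listing and watching *v1.Namespace from github.com/cloudnativelabs/kube-router/vendor/k8s.io/client-go/informers/factory.go:73
"""


def summarize_list_failures(log_text):
    """Count 'Failed to list' errors, keyed by (resource kind, reason)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_LIST.search(line)
        if match:
            key = (match.group("kind"), match.group("reason").strip())
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (kind, reason), count in summarize_list_failures(SAMPLE_LOG).items():
        print(f"{kind}: {reason} (x{count})")
```

In practice the input would be the output of kubectl logs for the kube-router pod rather than the embedded sample. If every resource kind fails with Unauthorized at once, credentials are a more likely culprit than RBAC rules for one resource.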

Then it dawned on me – “how does kube-router get its credentials again?…” – does it use a ServiceAccount? The serviceAccount seemed to be set correctly, but there was one thing I didn’t consider – it turns out kube-router uses a kubeconfig. This is where I found the actual problem – I had a hostPath entry pointing to a file on disk, and after the credentials were rotated (during a recent cluster upgrade/change), the file became stale. I updated it, and everything started working again.
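
To see where a DaemonSet's credentials come from, one option is to dump the DaemonSet as JSON (kubectl get daemonset ... -o json) and look for hostPath volumes and kubeconfig-related arguments. The sketch below does this for an already-parsed manifest. The sample manifest is a trimmed, made-up illustration and not my actual configuration; the paths and names in it are assumptions.

```python
def host_path_volumes(daemonset):
    """Return {volume name: host path} for each hostPath volume."""
    pod_spec = daemonset["spec"]["template"]["spec"]
    return {
        volume["name"]: volume["hostPath"]["path"]
        for volume in pod_spec.get("volumes", [])
        if "hostPath" in volume
    }


def kubeconfig_args(daemonset):
    """Return container command/args entries that mention a kubeconfig."""
    pod_spec = daemonset["spec"]["template"]["spec"]
    found = []
    for container in pod_spec.get("containers", []):
        for arg in container.get("command", []) + container.get("args", []):
            if "kubeconfig" in arg:
                found.append(arg)
    return found


# Hypothetical, heavily trimmed manifest used only to exercise the helpers.
SAMPLE_DAEMONSET = {
    "spec": {"template": {"spec": {
        "containers": [{
            "name": "kube-router",
            "args": [
                "--run-router=true",
                "--kubeconfig=/var/lib/kube-router/kubeconfig",
            ],
        }],
        "volumes": [
            {"name": "kubeconfig",
             "hostPath": {"path": "/var/lib/kube-router/kubeconfig"}},
            {"name": "cni-conf", "configMap": {"name": "kube-router-cfg"}},
        ],
    }}}
}

if __name__ == "__main__":
    print(host_path_volumes(SAMPLE_DAEMONSET))
    print(kubeconfig_args(SAMPLE_DAEMONSET))
```

Any hostPath volume that ends up holding credentials is a file nobody rotates for you, which is exactly what bit me here.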

I don’t particularly like how manual the solution was (copying over a file), but since I plan on moving to a different CNI in the near future it’s good enough for me, for now.
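
If I did want to make the copy step less manual, a small script run after each credential rotation could do it. The sketch below compares a fresh kubeconfig with the one kube-router reads and copies it over only when they differ, keeping a backup of the old file. Both paths are supplied by the caller, and the function name is made up for this example.

```python
import shutil
from pathlib import Path


def refresh_kubeconfig(fresh, in_use):
    """Copy `fresh` over `in_use` when their contents differ.

    Returns True if a copy happened, False if the file was already
    up to date. The previous file is kept next to it as "<name>.bak".
    """
    fresh, in_use = Path(fresh), Path(in_use)
    if in_use.exists():
        # Kubeconfigs are small, so comparing full contents is cheap.
        if fresh.read_bytes() == in_use.read_bytes():
            return False
        shutil.copy2(in_use, Path(str(in_use) + ".bak"))
    shutil.copy2(fresh, in_use)
    return True
```

For example, refresh_kubeconfig("/path/to/rotated/kubeconfig", "/path/to/hostPath/kubeconfig") would be called with whatever locations apply on a given host; both paths there are placeholders.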