Stuffing both SSH and HTTPS on port 443 with stunnel, sslh, and Traefik

tl;dr - You can expose SSH over the same port HTTPS runs on (443). It turns out you can run a combination of stunnel (in my particular case stunnel3) and sslh as sidecar containers that work together to forward traffic to some container that runs SSH (i.e. sshd). At the ingress layer Traefik makes this easy to pull off by providing the IngressRouteTCP CRD along with TLS passthrough. The infrastructure-as-code is up on GitLab.

Context

A lot of the exploration I’ve been doing lately has been around running cloud workloads for other people. I’ve been essentially stockpiling knowledge that enables me to create a cloud provider – one of the things I think people haven’t realized is that infrastructure has gotten so easy (for 80%+ of the cases out there) that it’s brought the cost of doing things like that down to a manageable level for small teams. In the rush to the cloud and to complexity-as-a-service, I’m pretty steadily heading in the opposite direction – learning more about the fascinating world of operations. With this idea of running a cloud provider in my head, for the last few years it feels like I’ve been slowly searching the DevOps landscape to find the tools with the highest leverage-to-complexity ratio, and I’ve found some local (for now) maxima.

There are lots of ideas floating around my head but I think one of the things I was really interested in was wondering about all the things I could run over HTTPS. Wouldn’t it be cool to be able to ssh sshd-c449d49d7-x78p4.<environment>.<project>.<org>.provider.tld, but have your traffic go out over HTTPS instead of the usual ports (22, 2222, etc)? I’m also interested in the idea of easily doing completely in-browser cloud-driven terminals into instances with SSH in the browser, without the forwarding normally associated with web based SSH. I don’t think I’ve seen that done before.

A flow for SSH-over-HTTPS

Here’s the flow I’m envisioning:

  • Some user “bob” (in organization “org” and on project “proj”) uploads a container image to be served as a new service, let’s say hello-world
    • Along with the container image, bob provides some configuration that specifies settings & dependencies for the hello-world service
  • The hello-world service is a minimal container that contains nothing but a binary and maybe some shared libraries (let’s say it was built with a language that makes this kind of thing easy, like Rust)
  • hello-world service automatically gets made accessible (with/without authentication) by a URL like https://hello-world.staging.<project>.<org>.paas.cloud
  • Susie Sysadmin on the SRE team notices some issues with the service, and shows the failed metrics/other observability data to Bob
  • Bob Thebuilder can’t figure out what’s wrong but has left himself some crash reports in the container, so he wants to SSH in to the container to look at the file system in the container
  • Bob & Susie use an administration panel (or CLI tool) to request SSH access to an instance of the service
    • Of course, this request is authorized and audit logged, and could possibly be denied
  • An SSH container is injected
  • Bob & Susie can continue debugging right on the web (or ssh in the terminal) pointing at the same URL, ssh.debug.hello-world.staging.<project>.<org>.paas.cloud:443 (yeah that’s a mouthful)

Some people look at SSHing into containers in any environment (staging/production) as wrong but I’m not so dogmatic about it – if the fastest and simplest way to fix something is to SSH in and maybe correct a misconfiguration live, or make some other risky change, I want to caution against it, but leave it possible. Which reminds me, I should probably chat about security.

How it (already) works

Of course, I’m not the first person to think of a flow like this, and a lot of sites already have cloud consoles and SSH-over-HTTPS is essentially a solved problem. The above would look like this, if I used purely existing technology/approaches (or at least how I think they work, I’ve never worked at the companies that do this at scale, and probably never will):

graph TB;
  A(Developer Computer) -->| HTTPS over WWW | B(TLS terminating Edge Proxy)
  B -->| HTTP | C(HTTP to SSH Proxy)
  C -->| SSH | E(hello-world service)

How REST-ful commands get turned into native SSH interaction is the purview of applications like Apache Guacamole and libraries like github-butlerx/wetty, ajaxterm, and anyterm. That’s not really in scope for this post except to note that they all work the same way – they run an HTTP server that works with some JS-side client code to relay terminal input and output to and from an SSH session. Visualized, this looks like:

graph LR;
  A(JS Code) -->| XMLHttpRequest | B(TLS terminating Edge Proxy)
  B -->| HTTP | C( wetty / ajaxterm / anyterm )
  C -->| SSH | D(sshd)

Security

Before we go on, a very few choice words about security – security is hard. Generally, nothing is truly secure, but that’s no reason to not try. Doing things like this effectively multiplies the attack vectors of the things involved:

  • ssh and securing access to it
    • kinda sorta solved, basically only use certificate auth with a
    • maybe sprinkle in some fail2ban
    • anomaly detection
    • GeoIP and/or IP region restriction + monitoring
  • Kubernetes
    • Essentially the wild west since it’s so new, but has had some time to mature and definitely has a lot of eyes on it
    • Follow the CVEs
    • Update fast enough to close vulns but not fast enough to adopt new ones
    • Lots of ways to lock/limit access for workloads:
  • Traefik
    • Is the thing actually listening on your ports (on every node, if it’s a DaemonSet)
  • Golang runtime
    • Is the thing actually actually listening on your ports (traefik is written in Go)
  • sslh
    • C :)
  • The program(s) you use to run ssh and/or what responds over HTTPS
  • Something else you didn’t think of

So this is going to be a bit of a handwave, but don’t think about taking what I’m about to discuss into production without properly considering a defense-in-depth (yeah it’s a buzzword, but it’s the nicest way to say “expect your shit will be compromised”) security posture. Probably just prepare for the worst like any good gray beard sysadmin would.

How it could work

What’s interesting about this particular yak shave, anyway? Well what I think is novel is using 443 (HTTPS) for the actual routing of SSH itself. Don’t do JS things in the browser and then pass them to SSH; run a fully functional terminal emulator in the browser, and forward that to SSH directly. While I’m not sure that functionality is out there just yet there are some promising rumbles:

Anyway, today I’ll just use ssh directly from my terminal to test it. The setup as a whole has a few benefits that I’m trying to explore:

  • Keeping SSH ports closed (no need to open more ports on the underlying nodes)
  • Possibility of using SSH-in-the-browser with only a proxy on the backend (one piece of infrastructure, the AJAX-to-SSH program removed)

So it’s a subtle difference: we’re basically getting rid of only one conceptual part (and really it’s more like we’re replacing it with another less complicated part, as we’ll find out soon). I think this approach could be very easily expanded though, and at some point we could be doing lots of things over 443: accessing redis clusters, postgres, etc. Envoy is already well on the path of recognizing lots of protocols, and the idea of just doing a bit of smart proxying is attractive, though we won’t be using Envoy today.

Imagine being able to have the following URLs:

  • primary.redis.production.<project>.<org>.paas.cloud
  • primary.pg.production.<project>.<org>.paas.cloud

And not have to do any port juggling and opening/closing or worry about overly active firewalls! Things get even better if we start going over QUIC whenever that is fully stabilized and ready for widespread production use. Anyway, let’s keep moving.

Prior art

Luckily for me, smarter people have already built the necessary tooling to make this happen! There are a few solutions out there:

NGINX

One of the most widely deployed reverse proxies, NGINX is a high performance, trusted solution in the space, and I was able to rustle up exactly one link (I guess I didn’t search very hard) about doing this:

The key idea is simply to lean on NGINX’s capabilities as a TCP stream proxy.

HAProxy

A somewhat unexpected entry in this space, HAProxy is another well known and trusted solution in the proxying space. I mostly think of it as operating at Layer 7 (HTTP-level), but it can proxy plain TCP as well. I was able to easily find a few more resources out there for doing this:

sslh (we’re gonna use this)

Seemingly much less widely known, sslh (F/OSS software on GitHub) is a minimal tool that actually accomplishes this (and has a very nice README). One thing that’s interesting is that sslh can also support OpenVPN and other protocols, as one user pointed out on the mailing list. sslh passes my sniff test for a high quality project:

  • Good documentation
  • Good ratio of open to closed issues
  • No recent lingering PRs (and the older ones have comments)

Since I’m using Traefik as my proxy, sslh is the easiest thing to slot into my infrastructure and also happens to be the most featureful, as far as I can tell.
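
To make the idea concrete, here’s a rough Python sketch of the kind of first-bytes sniffing a protocol multiplexer like sslh performs. This is not sslh’s actual code (sslh is written in C and handles more protocols and edge cases than this); the probe and route names and the HTTP method list are made up for illustration, and the backend addresses are simply the ones used later in this post.

```python
from typing import Dict, Optional

# Hypothetical, incomplete list of HTTP methods used only for this sketch
HTTP_METHODS = {b"GET", b"HEAD", b"POST", b"PUT", b"DELETE", b"OPTIONS"}


def probe(first_bytes: bytes) -> str:
    """Guess the protocol of a new connection from the first bytes the client sent."""
    # SSH clients open with an identification string such as "SSH-2.0-OpenSSH_8.6"
    if first_bytes.startswith(b"SSH-"):
        return "ssh"
    # A TLS handshake record starts with content type 0x16 and major version 0x03
    if len(first_bytes) >= 2 and first_bytes[0] == 0x16 and first_bytes[1] == 0x03:
        return "tls"
    # Plain HTTP requests start with a method name followed by a space
    if first_bytes.split(b" ", 1)[0] in HTTP_METHODS:
        return "http"
    return "unknown"


def route(first_bytes: bytes, backends: Dict[str, str]) -> Optional[str]:
    """Pick the backend address for a connection, or None if nothing matches."""
    return backends.get(probe(first_bytes))


# Backend addresses as used in the Deployment later in this post
backends = {"ssh": "localhost:22", "tls": "localhost:443"}
print(route(b"SSH-2.0-OpenSSH_8.6\r\n", backends))  # the sshd container
print(route(b"\x16\x03\x01\x02\x00", backends))     # the TLS-speaking web container
```

The point is only that the decision is made on the first bytes the client sends, before any backend is contacted.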

Where Traefik comes in

So where does Traefik come in? Well I’m going to be using Traefik as my reverse proxy, and it’s going to forward my TCP streams – making it very easy for me to spin up the debug container that’s running sslh. Traefik, unlike some other ingress controllers, has all the features I need here:

  • TCP (& UDP) routers
  • HostSNI-based proxying for TLS-enabled streams (with passthrough)
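
HostSNI matching works because a TLS client names the host it wants in the very first message it sends (the ClientHello), normally in the clear, so a proxy can route on it without decrypting anything. Here’s a minimal Python sketch of pulling that name out of a raw ClientHello. It’s an illustration, not Traefik’s implementation; extract_sni and sample_client_hello are hypothetical helper names, and the sketch assumes the whole ClientHello arrives in a single TLS record.

```python
import ssl
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a raw TLS ClientHello record, or None."""
    # TLS record header: content type (1), version (2), length (2)
    if len(record) < 5 or record[0] != 0x16:
        return None  # not a TLS handshake record (e.g. a plain SSH banner)
    pos = 5
    # Handshake header: message type (1), length (3); 0x01 is ClientHello
    if len(record) < pos + 4 or record[pos] != 0x01:
        return None
    pos += 4
    pos += 2 + 32  # client version + random
    try:
        pos += 1 + record[pos]  # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len  # cipher suites
        pos += 1 + record[pos]  # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(len(record), pos + ext_total)
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # list length (2), name type (1), name length (2), name
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def sample_client_hello(hostname: str) -> bytes:
    """Build a real ClientHello in memory (no network involved)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for the server's reply
    return outgoing.read()


hello = sample_client_hello("ssh-over-https.experiments.vadosware.io")
print(extract_sni(hello))
```

A plain SSH connection never sends a ClientHello, so there is no SNI to match on: extract_sni(b"SSH-2.0-OpenSSH_8.6\r\n") returns None.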

Traefik continues to be the most featureful ingress controller out there in the ecosystem as far as I’m concerned. The ergonomics are also pretty good – the CRDs aren’t so bad once you get used to them and they are implementing Gateway API support as they go. I’m not an ambassador anymore, so I’m not affiliated with Traefik the corporation in any way (as I’ve had to disclaim in the past), and it feels nice to say that’s my opinion, as unbiased as it could be (I’m obviously biased for Traefik since it’s useful and F/OSS and delivering lots of value for me).

The setup

The sshd service

So let’s get this show on the road, here are the resources we’ll be using. We’ll be standing up a sshd “service”. I’ll describe it from the “outside” in:

sshd-https.ingressroutetcp.yaml

---
kind: IngressRouteTCP
apiVersion: traefik.containo.us/v1alpha1
metadata:
  name: sshd-https
  namespace: experiments
spec:
  entryPoints:
    - websecure
  tls:
    secretName: experiments-sshd-tls
    passthrough: true
  routes:
    - kind: Rule
      match: HostSNI(`ssh-over-https.experiments.vadosware.io`)
      services:
        - name: sshd
          port: 8443

Traefik’s IngressRouteTCP custom resource does what it sounds like – it allows TCP traffic ingress into your cluster. One thing I’m going to want over the wide internet is for that traffic to travel over HTTPS for the TLS (Transport Layer Security). We’ll be relying on cert-manager for that (how to set it up is out of scope, but you should probably be running this on your cluster, no matter how you provision certs):

sshd-tls.certificate.yaml

---
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: sshd-tls
  namespace: experiments
spec:
  secretName: experiments-sshd-tls
  commonName: ssh-over-https.experiments.vadosware.io
  dnsNames:
    - ssh-over-https.experiments.vadosware.io
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer

That IngressRouteTCP custom resource is going to let the traefik daemons (deployed as a DaemonSet, of course) listening on every node know to point the traffic to a Kubernetes Service object (not a TraefikService, which also exists):

sshd.svc.yaml

---
apiVersion: v1
kind: Service
metadata:
  name: sshd
  namespace: experiments
  labels:
    app: sshd
spec:
  selector:
    app: sshd
  ports:
    - name: sslh
      protocol: TCP
      port: 8443

And that Service is going to work with our cluster CNI and forward that traffic to our sshd deployment:

sshd.deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sshd
  namespace: experiments
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sshd
  template:
    metadata:
      labels:
        app: sshd
    spec:
      containers:
      # whoami shows web visitors some basic information about the container
      - name: whoami
        image: traefik/whoami
        args:
          - -key=/etc/tls/tls.key
          - -cert=/etc/tls/tls.crt
          - -name=ssh-over-https.experiments.vadosware.io
          - -port=443
        ports:
          - containerPort: 443
        volumeMounts:
          - name: tls
            mountPath: /etc/tls

      # sshd gives ssh visitors shell access if they have the right credentials
      - name: sshd
        # 1.3.0 as of 05/07/2021
        image: panubo/sshd@sha256:b5435f6c5a667ae8c3bd7ea6571612e888515b42d5c78b3c2d0e05bd93bcb365
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        env:
          - name: SSH_ENABLE_ROOT
            value: "true"
          - name: SSH_ENABLE_PASSWORD_AUTH
            value: "false"
          - name: DISABLE_SFTP
            value: "true"
        volumeMounts:
          - name: keys
            mountPath: /etc/authorized_keys

      # SSLH proxies the right streams to the right places
      - name: sslh
        # alpine:3.13.5 as of 05/07/21
        image: alpine:3.13.5
        imagePullPolicy: IfNotPresent
        command:
          - /bin/ash
          - -xc
          # This is not a great way to install sslh but
          # for some quick prototyping container... this will do
          # (https://github.com/yrutschle/sslh/pull/205)
          #
          # NOTE: transparent proxying does not work out of the box
          - |
            echo "installing sslh (very wastefully)...";
            apk add -X http://dl-cdn.alpinelinux.org/alpine/edge/testing sslh

            echo "running sslh..."
            sslh \
              --foreground \
              --verbose=2 \
              --listen=$(SSLH_LISTEN_ADDR) \
              --ssh=$(SSLH_SSH_ADDR) \
              --tls=$(SSLH_TLS_ADDR)            
        ports:
          - containerPort: 8443
        env:
          - name: SSLH_LISTEN_ADDR
            value: 0.0.0.0:8443
          - name: SSLH_SSH_ADDR
            value: localhost:22 # same-pod containers share localhost
          - name: SSLH_TLS_ADDR
            value: localhost:443 # same-pod containers share localhost

      volumes:
        - name: www-html
          configMap:
            name: www-html
        - name: keys
          configMap:
            name: sshd-authorized-keys
        - name: tls
          secret:
            secretName: experiments-sshd-tls

There are some bits left out but you should get the idea. Check out the full code listing on GitLab if you’d like.

Improving kcup to serve a single page over TLS

As for the visitors that are not speaking SSH, I want to serve them a small page. In the example deployment above I’m using traefik/whoami, but what I really want to use is a tool I wrote called kcup (which I’ve written about in the past) to serve a simple HTML5 page. You could put basically anything there that is capable of handling TLS (since TLS will be passed through). I actually added the code to support TLS to kcup just for this post, so here’s what that looks like:

      # kcup shows web visitors a basic web page
      - name: kcup
        # kcup @ v0.2.1 as of 05/09/2021
        image: registry.gitlab.com/mrman/kcup-rust/cli@sha256:1f34f74670ce0ea711d495669a3a0f262f09a01cebd1179b93b2cf5e68464537
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            cpu: 200m
            memory: 200Mi
        env:
          - name: HOST
            value: "127.0.0.1"
          - name: PORT
            value: "8080" # 8443 for TLS
          - name: FILE
            value: /www/index.html
          - name: TLS_KEY
            value: /etc/tls/tls.key
          - name: TLS_CERT
            value: /etc/tls/tls.crt
        volumeMounts:
          - name: tls
            mountPath: /etc/tls
          - name: www-html
            mountPath: /www

At this point all we can do is look at the nice page (whose content is read from the www-html ConfigMap) in our browser, since we’re not actually doing the SSH bit yet:

(Screenshot: http working)

sslh service

OK on to the SSH bit. We want our traffic to actually go to sslh instead of straight to the kcup container – and sslh will decipher the protocols and do the right redirection. The sslh container is already in the Deployment, and that config might work so let’s try it out:

$ ssh -i ~/.ssh/personal/id_rsa ssh-over-https.experiments.vadosware.io -p 443 -v
OpenSSH_8.6p1, OpenSSL 1.1.1k  25 Mar 2021
debug1: Reading configuration data /home/mrman/.ssh/config
debug1: /home/mrman/.ssh/config line 6: Applying options for *.experiments.vadosware.io
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to ssh-over-https.experiments.vadosware.io [176.9.30.135] port 443.
debug1: Connection established.
debug1: identity file /home/mrman/.ssh/personal/id_rsa type 0
debug1: identity file /home/mrman/.ssh/personal/id_rsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.6
debug1: kex_exchange_identification: banner line 0: HTTP/1.1 400 Bad Request
debug1: kex_exchange_identification: banner line 1: Content-Type: text/plain; charset=utf-8
debug1: kex_exchange_identification: banner line 2: Connection: close
debug1: kex_exchange_identification: banner line 3:
kex_exchange_identification: Connection closed by remote host
Connection closed by 176.9.30.135 port 443

Job… not done. Looks like there’s something wrong here – I’m getting HTTP as a response where SSH is expecting to get SSH messages! It looks like Traefik is actually intercepting the SSH (it never makes it back to sslh, a quick look at the logs shows no movement) and expecting HTTPS to come through.

ISSUE: Traefik needs TLS for HostSNI, but returns 400 for non-HTTPS protocol (SSH)

So here’s the first big issue, and it’s a rather subtle one – sslh is not getting consulted at all about the incoming SSH-over-443 connection, because Traefik is attempting to disassemble it and failing, and returning a 400 as a result (before hitting the backend itself). Since I need SNI (which is provided by TLS) to have a chance at hosting multiple SSH endpoints on the same machine, it looks like I have to find a way to get past this.

Some Q/A I had with myself:

  • Q: Can SNI be bolted/hacked on to SSH?
    • A: Nope, not really, the question itself is silly
  • Q: If we run sslh at the absolute edge, could it possibly direct traffic properly? Seems like sslh has to be at the edge
    • A: Yeah maybe, but I don’t want to do that, what am I going to do, write an ingress controller that spins up instances of sslh…?
  • Q: Has anyone else done this?
  • Q: Hmnn but that medium post (iyzico.engineering link) makes me think that if I set it up that way things might actually work – could I hide SSH in TLS twice (essentially “SSL(SSL(HTTP | SSH))” ) and then have that pass through Traefik but get unwrapped later?
    • A: Hmnn wild, terrible sounding idea, but let’s try it.

The first step in that terrible wild idea I settled on was adding ProxyCommand for my SSH connections:

Host ssh-over-https.experiments.vadosware.io
  ProxyCommand openssl s_client -quiet -servername %h -connect %h:443

But alas, the wild crazy idea doesn’t work – the connection actually gets through, and I can see it show up in the logs of sslh, but sslh still thinks the connection is TLS, so it forwards the connection to the wrong place. I need to unwrap a layer. I even tried adding a second sslh instance to try and tear it apart (thinking that sslh was tearing apart one layer at a time). Luckily for me though, there’s another way, and it’s basically well known at this point (so much so that it’s in the docs of sslh)!

RESOLUTION: stunnel + sslh

Turns out the resolution to this issue is actually quite simple (because other people have worked hard on it already) – using stunnel in combination with sslh, with that ProxyCommand standing in for the proxytunnel -e step described in the sslh docs.
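
Here’s a toy Python model of why the extra hop matters. It doesn’t do any networking: a connection is just a list of protocol layers, outermost first, and the sslh and stunnel functions are simplified stand-ins (my assumption of the behaviour described above, not the real tools). The backend addresses are the ones from the Deployments in this post.

```python
from typing import Dict, List


def sslh(layers: List[str], backends: Dict[str, str]) -> str:
    """sslh routes on the outermost protocol only; it never unwraps anything."""
    return backends.get(layers[0], "dropped")


def stunnel(layers: List[str]) -> List[str]:
    """stunnel terminates TLS: peel off one outer "tls" layer and pass on the rest."""
    if not layers or layers[0] != "tls":
        raise ValueError("stunnel expects a TLS-wrapped stream")
    return layers[1:]


# What ssh sends through the openssl ProxyCommand: SSH wrapped in TLS
wrapped_ssh = ["tls", "ssh"]
# What a browser sends: HTTP wrapped in TLS
browser = ["tls", "http"]

# Attempt with sslh alone (backends from the first Deployment)
first_try = sslh(wrapped_ssh, {"ssh": "localhost:22", "tls": "localhost:443"})
print(first_try)  # localhost:443, i.e. the web container rather than sshd

# With stunnel in front of sslh (backends from the updated Deployment)
final_backends = {"ssh": "localhost:22", "http": "localhost:8080"}
ssh_target = sslh(stunnel(wrapped_ssh), final_backends)
web_target = sslh(stunnel(browser), final_backends)
print(ssh_target)  # localhost:22
print(web_target)  # localhost:8080
```

Once stunnel has removed the TLS layer, sslh finally gets to see the SSH banner or the HTTP request line underneath and can pick the right container.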

It works!

After resolving some issues and changing some configuration, I got it working. First the server-side logs:

$ k logs -f sshd-c449d49d7-x78p4 -c sshd
[info] copying files /authorized-users to /etc/authorized_keys
[info] changing permissions on /etc/authorized_keys, /root/.ssh
[info] starting sshd...
chmod: /root/.ssh/authorized_keys: Read-only file system
> Starting SSHD
>> Generating new host keys
ssh-keygen: generating new host keys: RSA DSA ECDSA ED25519
>>> Fingerprints for dsa host key
1024 MD5:01:57:dd:be:89:ab:c3:c7:d9:d7:75:56:51:91:2d:fe root@sshd-c449d49d7-x78p4 (DSA)
1024 SHA256:da03+OK9FRph7N++bFcJyTXvbFCk+gHlgq4vX8A5/3I root@sshd-c449d49d7-x78p4 (DSA)
1024 SHA512:8gr8DwIx4afGbRV6qF6fABbrVwa8hEmoXesDPz+Sh04WuYyGZ/iPKYEltrue4pY5b8/KA/Hc6u9YZZ3HvHyLZw root@sshd-c449d49d7-x78p4 (DSA)
>>> Fingerprints for rsa host key
3072 MD5:96:c3:ec:5a:dc:60:00:7b:5d:f6:7d:aa:de:65:94:9a root@sshd-c449d49d7-x78p4 (RSA)
3072 SHA256:Pa1pi64e4v1KoGeDFLS2MUgyUwsu43LOhuMYXi4RK4k root@sshd-c449d49d7-x78p4 (RSA)
3072 SHA512:sxuEcLxoWeB1IALJe/QMIXhn7JW5lmQ1vzPcf9TLJDNbr9h2kyFl0Zboo3l6RoPHIkgKee5bFAPMOuR2D0ELcA root@sshd-c449d49d7-x78p4 (RSA)
>>> Fingerprints for ecdsa host key
256 MD5:0a:2b:83:8e:4a:92:68:d5:9f:6b:7b:62:c5:26:45:39 root@sshd-c449d49d7-x78p4 (ECDSA)
256 SHA256:P6q5KtLJpHKEp8ud4fbC+L+Pbjw6YHHdHO8X2QhEGh8 root@sshd-c449d49d7-x78p4 (ECDSA)
256 SHA512:nMqKAUd/R8dZ6jpwmZe37QeZ0LkUxRupIYpiJDxVINPJsdTlHn35xvvEjYDguivBFHo4p3crGy+UhyZWZra7IQ root@sshd-c449d49d7-x78p4 (ECDSA)
>>> Fingerprints for ed25519 host key
256 MD5:ac:bd:90:f3:5b:e5:e2:68:39:13:70:e4:28:44:9d:7a root@sshd-c449d49d7-x78p4 (ED25519)
256 SHA256:002QZIOY529lGBGCGE6liTrWK8l9sen+3+t/vz/8Xmw root@sshd-c449d49d7-x78p4 (ED25519)
256 SHA512:eY9XAlL4nvzn7N1Ph6WFSKcA/vCNNG0FWcQnCuO71uXvcNxzq2UbA+dqY0EFIeKt0Uqw0rk/Q1gTMeKR7R5glQ root@sshd-c449d49d7-x78p4 (ED25519)
>> Unlocking root account
INFO: password authentication is disabled by default. Set SSH_ENABLE_PASSWORD_AUTH=true to enable.
Running /usr/sbin/sshd -D -e -f /etc/ssh/sshd_config
Server listening on 0.0.0.0 port 22.
Server listening on :: port 22.


Accepted publickey for root from ::1 port 41448 ssh2: RSA SHA256:g6bCJRwmZFusZ/jyjl3vKB43cSZNjLpD+arXulk4wEk

On the client side, using ssh in the terminal with that custom ProxyCommand:

$ ssh -i ~/.ssh/personal/id_rsa root@ssh-over-https.experiments.vadosware.io -p 443
depth=2 C = US, O = Internet Security Research Group, CN = ISRG Root X1
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = R3
verify return:1
depth=0 CN = ssh-over-https.experiments.vadosware.io
verify return:1
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ED25519 key sent by the remote host is
SHA256:002QZIOY529lGBGCGE6liTrWK8l9sen+3+t/vz/8Xmw.
Please contact your system administrator.
Add correct host key in /home/mrman/.ssh/known_hosts to get rid of this message.
Offending ED25519 key in /home/mrman/.ssh/known_hosts:47
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
UpdateHostkeys is disabled because the host key is not trusted.
Welcome to Alpine!

The Alpine Wiki contains a large amount of how-to guides and general
information about administrating Alpine systems.
See <http://wiki.alpinelinux.org/>.

You can setup the system with the command: setup-alpine

You may change this message by editing /etc/motd.

sshd-c449d49d7-x78p4:~#

Beautiful – this means that with the tooling I’ve got set up in my cluster (Traefik, cert-manager, etc), I’m basically prepared to offer dynamic, shared-namespace terminals over HTTPS to anyone that wants to rock up and use one (with the small caveat that they need to proxy first). It’s not as simple as I wanted – stunnel had to be included, and I ended up not needing the HTTP/TLS support I added to kcup – but all-in-all, thanks to these giants, I’ve got something I think is pretty cool.

Here’s how it works now:

graph TB;
  A(Developer Computer) -->| SSH with ProxyCommand | B(Edge Proxy with TLS passthrough)
  B -->| TLS | C(stunnel)
  C -->| fork / execute | D(sslh)
  D -->| HTTP | E(kcup)
  D -->| SSH | F(sshd)

Definitely not the simplicity I was going for but it does work!

This is where I’d link you to the awesome cloud provider I built that uses this tech but I don’t have that yet, and probably won’t for years. If you’re interested in that (or other things I build/write about), get in touch or join the mailing list! Always trying to deliver more value to the mailing list members, lately releasing some posts early and probably in the future giving discounts on services I run.

For the curious: the updated deployment.yaml

You should check out the GitLab repo for the details, but here’s a sneak peek at what the added complexity looks like:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sshd
  namespace: experiments
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sshd
  template:
    metadata:
      labels:
        app: sshd
    spec:
      containers:
      # SSLH proxies the right streams to the right places
      - name: sslh
        # alpine:3.13.5 as of 05/07/21
        image: alpine:3.13.5
        imagePullPolicy: IfNotPresent
        command:
          - /bin/ash
          - -xc
          # This is not a great way to install sslh but
          # for some quick prototyping container... this will do
          # (https://github.com/yrutschle/sslh/pull/205)
          #
          # NOTE: transparent proxying does not work out of the box
          - |
            echo "installing stunnel & sslh (very wastefully)...";
            apk add perl stunnel;
            apk add -X http://dl-cdn.alpinelinux.org/alpine/edge/testing sslh;

            echo "creating combined cert + private key PEM file @ /etc/tls.combined.pem"
            cat /etc/tls/tls.crt /etc/tls/tls.key > /etc/tls.combined.pem

            echo "running stunnel + sslh...";
            /usr/bin/stunnel3 \
              -f \
              -p $(TLS_COMBINED_CERT) \
              -d $(LISTEN_ADDR) \
              -l /usr/sbin/sslh -- \
                --inetd \
                --verbose=2 \
                --ssh=$(SSLH_SSH_ADDR) \
                --http=$(SSLH_HTTP_ADDR)            
        ports:
          - containerPort: 8443
        env:
          - name: LISTEN_ADDR
            value: 0.0.0.0:443 # the Service has to target this port (containerPort is informational only)
          - name: TLS_COMBINED_CERT
            value: /etc/tls.combined.pem
          - name: SSLH_SSH_ADDR
            value: localhost:22 # same-pod containers share localhost
          - name: SSLH_HTTP_ADDR
            value: localhost:8080 # same-pod containers share localhost
        volumeMounts:
          - name: tls
            mountPath: /etc/tls

      #############
      # Workloads #
      #############

      # sshd gives ssh visitors shell access if they have the right credentials
      - name: sshd
        # 1.3.0 as of 05/07/2021
        image: panubo/sshd@sha256:b5435f6c5a667ae8c3bd7ea6571612e888515b42d5c78b3c2d0e05bd93bcb365
        imagePullPolicy: IfNotPresent
        command:
          - /bin/ash
          - -c
          - |
            echo "[info] copying files /authorized-users to /etc/authorized_keys";
            cp -r /authorized-users/* /etc/authorized_keys;

            echo "[info] changing permissions on /etc/authorized_keys, /root/.ssh";
            chmod -R 700 /etc/authorized_keys;
            chmod -R 700 /root/.ssh;

            echo "[info] starting sshd...";
            /entry.sh /usr/sbin/sshd -D -e -f /etc/ssh/sshd_config;            
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: 1000m
            memory: 2Gi
        env:
          - name: SSH_ENABLE_ROOT
            value: "true"
          - name: SSH_ENABLE_PASSWORD_AUTH
            value: "false"
          - name: DISABLE_SFTP
            value: "true"
        volumeMounts:
          - name: ssh-root-authorized-keys
            mountPath: /root/.ssh/authorized_keys
            subPath: authorized_keys
            readOnly: true
          - name: ssh-authorized-users
            mountPath: /authorized-users
            readOnly: true

      # kcup shows web visitors a basic web page
      - name: kcup
        # kcup @ v0.2.0 as of 05/09/2021
        image: registry.gitlab.com/mrman/kcup-rust/cli@sha256:1f34f74670ce0ea711d495669a3a0f262f09a01cebd1179b93b2cf5e68464537
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            cpu: 200m
            memory: 200Mi
        env:
          - name: HOST
            value: "127.0.0.1"
          - name: PORT
            value: "8080" # 8443 for TLS
          - name: FILE
            value: /www/index.html
          # - name: TLS_KEY
          #   value: /etc/tls/tls.key
          # - name: TLS_CERT
          #   value: /etc/tls/tls.crt
        volumeMounts:
          # - name: tls
          #   mountPath: /etc/tls
          - name: www-html
            mountPath: /www

      volumes:
        - name: www-html
          configMap:
            name: www-html
        - name: ssh-authorized-users
          configMap:
            name: ssh-authorized-users
            defaultMode: 0600
        - name: ssh-root-authorized-keys
          configMap:
            name: ssh-root-authorized-keys
            defaultMode: 0600
        - name: tls
          secret:
            secretName: experiments-sshd-tls

References

A few pages that I kept open while I was looking at all this and piecing it together:

Future improvement

One way that this setup could be massively simplified is actually building sslh into Traefik itself – a plugin that runs sslh right after a connection is made would reduce the toil of getting this to work to zero, pretty quickly.

Before taking this into production, a careful review of the additional attack surface, along with how I can monitor access and properly report on it, would be a good idea. Also, a check into the overhead introduced by stunnel and sslh might make sense, but in general the ssh.* prefix probably shouldn’t be getting hit that often!

Wrap up

It was fun to explore this idea and get a feel for enabling SSH over HTTPS. Again, the F/OSS world delivers tons of value for free – all you have to do is have some ideas on how to put it all together at this point. sslh is a pretty featureful and stable-looking tool and I’d be surprised to have many problems with it.
