Static Binaries for Haskell: A Convoluted Approach

A convoluted (but working) path to building binaries for a Haskell app

vados

NOTE - this blog post has been updated thanks to feedback from the reddit community, who pointed out ways this should have been done (or done better)! If you read any part of this with any intent to use it, definitely read the updates at the end!

tl;dr - After a bunch of trial and error, I ended up building a mostly static binary from a docker container. With hindsight it was only “mostly” static: after trying to get sendmail working from Haskell code, the getProtocolByName library call was failing, which pointed to the fact that a bunch of libraries were NOT included in the executable I thought was fully static (GHC warned me) and needed to be present in the same form in the deployment container. This prompted making sure that the container that built the binary had the same underlying OS (ubuntu vs alpine) as the deployment container. Check out the Dockerfiles towards the end to see what things looked like for me.

As explored in previous blog posts, my current deployment strategy depends heavily on containers, in particular docker containers. Containers simplify deployment for me, and getting an app into a container is simplest with a language that produces static binaries (or with plugins/libraries that will build them). The ease of interpreted languages often comes with a tradeoff at runtime – you need to bring the interpreter, along with all the libraries that your program needs, to the system that’s going to run your program. This is often not a big deal – interpreted languages are still a great choice for developing and deploying applications, but deployments can become more complicated than necessary, and every extra package installation command makes your container build more brittle.

If you’ve chosen a language like Golang, which has excellent support for creating statically linked binaries, then you’re in luck: you can create very minimal containers (you can even use the scratch base image) and do nothing but copy the built executable into the image to get an app ready for production.

Today’s post isn’t about how easy it is to do this with Go, however; it’s about how easy it was (or wasn’t, I’ll let you decide) to do this with Haskell. This is certainly NOT an introductory text to Haskell. If you’re looking for an introduction, Learn You A Haskell For Great Good is a better read than anything on this blog.

Figuring out what I was doing wrong, and how to build static executables with Haskell was a lot of wandering in the dark, but I was greeted with success at the end of the road (and with more issues, as these things often go). Here’s what happened:

The beginning – trying to RTFM

It should have been relatively simple to build a static binary: GHC, Haskell’s compiler, accepts -optl-static (which passes -static through to the linker), and if it’s as easy as it sounds, I should just need to tack it on somewhere (like in the .cabal file?). Of course, things aren’t simple, so I ended up spending a bunch of time seeing how other people have tackled this. Here are a few links that I found useful:

Turns out the fact that I’m running Arch Linux made things a little more complicated, because it doesn’t download the sources or development versions for various libraries that are used in builds (it just keeps the binaries)…

There’s lots of information to take in on these links, but they show that other people have thought about this (and found solutions), which was pretty encouraging. My key findings on what is necessary to build a static executable:

  1. The crtbeginT.o hack needs to be performed (GCC’s crtbeginT.o gets swapped out for a copy of crtbeginS.o)
  2. Various libraries need to be properly statically built (ex. gmp)

The second point is kind of obvious (since we are trying to build a completely static binary), but someone who overlooked it may find it useful.
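For context, the hack in the first point (the files are normally named crtbeginT.o and crtbeginS.o) amounts, as far as I understand it, to backing up GCC's crtbeginT.o and overwriting it with a copy of crtbeginS.o. A minimal sketch, assuming the GCC library directory is passed in as an argument; the function name is made up for illustration:

```shell
#!/bin/sh
# Sketch of the crtbeginT.o hack: keep a backup of crtbeginT.o, then
# replace it with a copy of crtbeginS.o. Best run inside a throwaway
# container rather than on a machine you care about.
crtbegin_hack() {
    # $1: directory that holds GCC's crtbeginS.o and crtbeginT.o
    dir="$1"
    [ -f "$dir/crtbeginS.o" ] && [ -f "$dir/crtbeginT.o" ] || return 1
    # Only take the backup once, so re-running keeps the true original.
    [ -f "$dir/crtbeginT.o.orig" ] || cp "$dir/crtbeginT.o" "$dir/crtbeginT.o.orig"
    cp "$dir/crtbeginS.o" "$dir/crtbeginT.o"
}

# Example (assumes gcc is installed and the directory is writable):
#   crtbegin_hack "$(dirname "$(gcc -print-file-name=crtbeginT.o)")"
```

Keeping the .orig copy around makes the change easy to undo once the experiment is over.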

Starting down the rabbit hole

After reading all the above links, I figure I know how to get started, and begin to go down the rabbit hole of building a bunch of the libraries on my own machine from scratch. It basically went like this:

  1. Retrieve gmp package from ABS (ABS=Arch Build System)
    • Download/start using asp, which makes it easier to get packages for Arch
    • asp export gmp makes a folder called gmp in the current path, with the package build in it
  2. Edit the package build and install the package, which consists of:
    • Adding a line that says options=(staticlibs)
    • Running makepkg && makepkg -i
    • Ran into an issue with the keys not being trusted, had to run gpg --recv-keys F3599FF828C67298, seems to belong to “Niels Möller nisse@lysator.liu.se” which sounds right…?
  3. Realize I should stop. It doesn’t make sense to make all these changes to my system just to build one project – I have a tool in my toolbox that can help here – Docker.
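The PKGBUILD edit from step 2 can be scripted. This is only a sketch under assumptions: the helper name is hypothetical, and it simply appends the options line unless one mentioning staticlibs is already there (a later options= assignment overrides an earlier one, so check the file if the package already sets other options):

```shell
#!/bin/sh
# Append options=(staticlibs) to a PKGBUILD unless an options= line
# already mentions staticlibs.
enable_staticlibs() {
    # $1: path to the PKGBUILD
    pkgbuild="$1"
    if grep -q '^options=.*staticlibs' "$pkgbuild"; then
        return 0
    fi
    printf '%s\n' 'options=(staticlibs)' >> "$pkgbuild"
}

# Example, mirroring the steps above (asp and makepkg are Arch tools):
#   asp export gmp && cd gmp
#   enable_staticlibs PKGBUILD
#   makepkg && makepkg -i
```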

This is one of the great things about docker/containers in general – you can use them for dirty jobs like this where you don’t want to dirty your own system. Some developers even go as far as dockerizing their whole development environment (I do not), using a containerized IDE/editor along with other resources. Dockerizing the build step of an application is pretty tame in 2017 container land – stack even has an option to build in docker built into it.

NOTE I willingly chose to trade one form of what I consider to be accidental complexity here for another – rather than having the next developer that touches the codebase with a fresh computer spend 30 minutes doing setup of the static libraries before they can do a build, I’ve now mandated that they learn enough to become dangerous with docker before they start. I think the latter is the better choice in this case – I can do my best to make the docker process pain-free and all they have to have is docker properly installed. I can’t confidently say the same for making sure that background libraries get built and GHC gets configured properly.

Dockerizing the rabbit hole

So, it’s time to experiment (inside a docker container) and figure out what I need to do to get the app to compile statically. Here’s how I started my experimentation:

  1. docker run -it alpine /bin/sh (the stock alpine image ships sh, not bash)
  2. docker cp api/ <generated container name>:/var/app/code
  3. Do the crtbeginS.o/crtbeginT.o hack
  4. # cd /var/app/code (inside the container)
  5. stack build --ghc-options='-optl-static -optl-pthread' --force-dirty (this command takes a while, package indices get updated, etc)

After getting to step #5 a few times, I started to wonder if there was any way to cut down on the amount of time it took – stack was spending a lot of time installing the 8.15 resolver, and that seemed thoroughly unnecessary. FP Complete is pretty down with docker, and I remembered that they have a container that comes with everything for the 8.15 resolver pre-installed! Enter fpco/stack-build:lts-8.15.

So, I make a Dockerfile that starts from that stack-build container, and get to work again – even quicker, even dirtier:

  1. Start from fpco/stack-build:lts-8.15 container (command should be something like docker run -it fpco/stack-build:lts-8.15 /bin/bash)
  2. docker cp api/ <generated container name>:/root
  3. stack install --local-bin-path /root/ --ghc-options='-optl-static -optl-pthread' --force-dirty
  4. Slightly shorter package/cache retrieval/update time
  5. Find out that the right ghc isn’t installed so I needed to run stack setup

At this point I’m starting to think I’m getting somewhere so I should start checking things into the repo, so I move to a “real” Dockerfile and start codifying the steps I’m taking (basically adding lots of RUN commands).

  1. Add lines to do the crtbeginS/crtbeginT hack
  2. SUCCESS – the app builds with those options, and produces a static binary.

Success! But crazy long build time :(

The naive Dockerfile did all the steps I figured out during testing in the same order, and they turned out to take a VERY long time for some reason. The steps look to be:

  1. Downloading 8.15 build plan (the LTS resolver I’m using)
  2. Updating package index Hackage (mirrored at https://github.com/commercialhaskell/all-cabal-hashes.git) …
  3. Downloading GHC 8.0.2 and installing it
  4. THEN finally getting to making my changes

Here’s what the output looked like:

Downloading lts-8.15 build plan ...
Downloaded lts-8.15 build plan.
Updating package index Hackage (mirrored at https://github.com/commercialhaskell/all-cabal-hashes.git) ...

Fetching package index ...
Fetched package index.
Populating index cache ...
Populated index cache.
Preparing to install GHC to an isolated location.
This will not interfere with any system-level installation.
Preparing to download ghc-8.0.2 ...
ghc-8.0.2: download has begun
ghc-8.0.2:   66.53 KiB / 107.55 MiB (  0.06%) downloaded...
ghc-8.0.2:  134.53 KiB / 107.55 MiB (  0.12%) downloaded...
ghc-8.0.2:  236.53 KiB / 107.55 MiB (  0.21%) downloaded...
ghc-8.0.2:  372.53 KiB / 107.55 MiB (  0.34%) downloaded...
ghc-8.0.2:  763.53 KiB / 107.55 MiB (  0.69%) downloaded...
ghc-8.0.2:    1.63 MiB / 107.55 MiB (  1.51%) downloaded...
ghc-8.0.2:    3.02 MiB / 107.55 MiB (  2.81%) downloaded...
ghc-8.0.2:    4.98 MiB / 107.55 MiB (  4.63%) downloaded...
ghc-8.0.2:    7.59 MiB / 107.55 MiB (  7.05%) downloaded...
ghc-8.0.2:   11.40 MiB / 107.55 MiB ( 10.60%) downloaded...
ghc-8.0.2:   16.29 MiB / 107.55 MiB ( 15.14%) downloaded...
ghc-8.0.2:   22.52 MiB / 107.55 MiB ( 20.94%) downloaded...
ghc-8.0.2:   29.69 MiB / 107.55 MiB ( 27.61%) downloaded...
ghc-8.0.2:   36.85 MiB / 107.55 MiB ( 34.26%) downloaded...
ghc-8.0.2:   44.41 MiB / 107.55 MiB ( 41.30%) downloaded...
ghc-8.0.2:   52.26 MiB / 107.55 MiB ( 48.59%) downloaded...
ghc-8.0.2:   58.27 MiB / 107.55 MiB ( 54.18%) downloaded...
ghc-8.0.2:   64.04 MiB / 107.55 MiB ( 59.54%) downloaded...
ghc-8.0.2:   69.93 MiB / 107.55 MiB ( 65.02%) downloaded...
ghc-8.0.2:   75.87 MiB / 107.55 MiB ( 70.54%) downloaded...
ghc-8.0.2:   81.96 MiB / 107.55 MiB ( 76.21%) downloaded...
ghc-8.0.2:   87.40 MiB / 107.55 MiB ( 81.26%) downloaded...
ghc-8.0.2:   93.07 MiB / 107.55 MiB ( 86.54%) downloaded...
ghc-8.0.2:   98.43 MiB / 107.55 MiB ( 91.52%) downloaded...
ghc-8.0.2:  102.49 MiB / 107.55 MiB ( 95.30%) downloaded...
ghc-8.0.2:  106.82 MiB / 107.55 MiB ( 99.32%) downloaded...
ghc-8.0.2:  107.55 MiB / 107.55 MiB (100.00%) downloaded...
Downloaded ghc-8.0.2.
Unpacking GHC into /root/.stack/programs/x86_64-linux/ghc-8.0.2.temp/ ...
Configuring GHC ...
Installing GHC ...

This is obviously not what I want to happen every time I try to build the app in a container (it takes forever), so I next set out to reduce the build time. The simplest way I could think of was to use two containers to build: a second container starts from this point and installs only the things necessary for the app. This means I have essentially 2 build containers, one for stack, the resolver, and ghc, and another built ON TOP of that one that actually builds the app.

This is a lot of complexity I chose to take on, and thinking about it at the time even I could see that I was doing a poor man’s imitation of a union file system. Unfortunately, I don’t know enough about how to control docker’s caching at the level of individual commands (or if it’s as easy as I think it is), so this approach was easier to implement and more straightforward to understand, IMO. Adding this second build container did drastically speed up my build times, as anyone would expect, so I chose to live with the complexity for now and hid it behind some nice ergonomic Make commands.

Zoom to the future, where everything breaks, and hindsight is 20/20

This is about where I would normally just post the code for you to use, but I’m going to go ahead and hold off. Why? The fpco/stack-build based container certainly works, and builds the static executable, but there’s one flaw that I didn’t actually notice till much much later.

Building a static executable is actually problematic because there are a bunch of libraries that GHC will NOT include for you, and just throw warnings about. One of those many libraries is tied to network functions, like getProtocolByName (which comes up later). I realized this after implementing an email sending feature that never worked, because sendmail was failing inside the deployment container.

It was pretty bewildering trying to track down the problem – I started by opening up the container, running sendmail from the command line and making sure it worked (it did), and I ended up going all the way back to the build output before finding the culprit: the build process described in this post. As I stated, various libraries actually DON’T get linked statically, and GHC warns you that the program will need to find the same libraries on the runtime system that it found at build time… Guess what didn’t happen? Yes, you got it, the exact thing GHC warned me about.

tl;dr - Basically, the build container and deployment container must have the same base container/OS, with similar compiled versions of those libraries.
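The warnings are easy to lose in a long build log. As far as I know, the glibc linker warnings read “Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking”. A minimal sketch for pulling the affected function names out of a saved log, assuming that wording; the function name and the log path are hypothetical:

```shell
#!/bin/sh
# List the libc functions the linker flagged as needing shared libraries
# at runtime, given a file that contains the captured build output.
static_link_warnings() {
    # $1: path to the saved build log
    sed -n "s/.*Using '\([A-Za-z0-9_]*\)' in statically linked applications.*/\1/p" "$1" \
        | sort -u
}

# Example (build.log is an assumed file name):
#   stack build --ghc-options='-optl-static -optl-pthread' 2>&1 | tee build.log
#   static_link_warnings build.log
```

Any name that shows up here is a function that may behave differently (or fail) when the deployment image does not match the build image.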

Feel free to skip the following few sections as they delve into the issue and my thought process while fixing it.

How in the world did I break sendmail, but only from within Haskell code?

The api sends emails using sendmail, thanks to Network.Mail.SMTP and Network.Mail.Mime. As previously stated, the issue was that sendmail would work from inside the container, but not from the Haskell code directly.

The mystery of the broken getProtocolByName

The error that sendmail was throwing from inside Haskell was very weird/unexpected:

Failed to connect SMTPMailer. Error: `getProtocolByName`: does not exist (no such protocol name: tcp)

The application just kept running because it’s currently OK for the SMTPMailer to exist without a connection to the SMTP server, but this is what I wasn’t expecting – why was it unconnected?

Sanity test: get into the running container and do manual sendmail

  • Since inside a container you can only access the host by using the IP the docker network gives it, check ip route on the host machine (in Makefile scripts I use the command ip route | grep docker0 | awk '{print $$9}' to identify the IP of the docker host; the doubled $$ is Makefile escaping for a literal $)
  • Get into the running app container with docker exec -it <running container> /bin/sh (I use sh here instead of bash because the container is alpine based)
  • Run sendmail manually with sendmail <address> (from there you’ll have to type in the pieces of the email). It looks something like:
/ # sendmail -S <docker host IP> <recipient email>
subject: This is a test
from: <from email>
This is a test
<CTRL-D to stop sendmail from letting you write more lines>

In case you didn’t know: CTRL-D is really cool, it sends end-of-file (EOF) to the program reading your input, and you can do it from a keyboard!
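The ip route one-liner from the first bullet depends on the address being the ninth field of the docker0 line. A slightly more defensive sketch, assuming the usual “... dev docker0 ... src <address>” layout of ip route output, looks for the word after src instead; the function name is hypothetical:

```shell
#!/bin/sh
# Read `ip route` output on stdin and print the address that follows
# "src" on the first line mentioning docker0.
docker_host_ip() {
    awk '/docker0/ {
        for (i = 1; i < NF; i++) {
            if ($i == "src") { print $(i + 1); exit }
        }
    }'
}

# Example (run on the docker host):
#   ip route | docker_host_ip
```

Inside a Makefile recipe the awk dollar signs would each need to be doubled, just like in the original command.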

After doing the sanity test, and seeing that emails indeed work from inside the container manually, the question then was why wasn’t my program (running in the exact same container) able to do it?

Sanity test #2: Check connections

Next test is making sure that the connection it was trying to make was to the right place. Luckily I was smart enough to print all the config for the app when it starts up, so all I had to do was find the following lines:

MailerConfig:
    MAILER_HOSTNAME...............172.17.0.1
    MAILER_PORT_NUMBER............25

That’s right, so it looks like the problem is elsewhere… time to take one more look at the error:

Failed to connect SMTPMailer. Error: getProtocolByName: does not exist (no such protocol name: tcp)

Google to the rescue (almost):

  • https://github.com/bos/wreq/issues/5
  • https://github.com/fpco/haskell-scratch/issues/2

I stumbled upon these issues and while they’re not exactly what I’m facing, they kind of point in the right direction. Unlike the people in those issues, I DO have /etc/protocols and /etc/services and the rest, and they look correct. Here’s what I think might be wrong at this point:

  1. /etc/ssl is missing… If this is indeed an SSL certs issue (despite the fact that I’m not using anything over HTTPS), then maybe that needs to be there?
    • Try to install w/ apk --no-cache add ca-certificates && update-ca-certificates while inside container
    • /etc/ssl exists now
    • Run a different server w/ command PORT=5555 /var/app/bin/api-static RunServer, get same error
  2. netbase package in Ubuntu seems to fix this problem… I wonder if alpine has something similar? I don’t think this is the issue because all the folders are there…
  3. There are a bunch of warnings from static compiling about functions that dlopen shared libraries at runtime, one of which is getaddrinfo… Since I’m not getting some weird segfault, and the error says what it says, it’s probably not this? (DING DING DING, it’s this, the thing I thought it wasn’t)

I’m going to spare you the notes where I dive into the rabbit hole to try and disprove 1 and 2. The problem turns out to be 3 – that the warnings from static compilation were affecting me, primarily because I built the executable in fpco/stack-build, which is based on ubuntu deep down, whereas I run my container with alpine linux!
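For anyone retracing the /etc/protocols check from the linked issues, here is a minimal sketch of it; the function name is hypothetical, and it only matches the primary protocol name in the first column, not aliases. Note that this check passing, as it did here, only rules out a missing or broken file. It says nothing about the static linking problem that turned out to be the real cause.

```shell
#!/bin/sh
# Succeed if the protocols file has an entry whose name is exactly $1.
has_protocol() {
    # $1: protocol name, e.g. tcp
    # $2: protocols file (defaults to /etc/protocols)
    awk -v name="$1" '
        /^[ \t]*#/ { next }          # skip comment lines
        $1 == name { found = 1 }     # first column is the protocol name
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/protocols}"
}

# Example (inside the deployment container):
#   has_protocol tcp && echo "tcp entry present"
```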

The completed Dockerfiles

The result of all this work is two Dockerfiles that build the application. The first is a lower level build, which covers GHC, Stack, and all the resolver libraries, and the next is the actual app install. Check it out:

builder-base container (the “first” container which builds Stack, GHC, etc)

FROM alpine
LABEL description="Builds API static binary"

# Note this file is meant to be run from the TOP LEVEL `api` directory using `-f` option to docker
COPY api/ /root/api

WORKDIR /root/api

# update repositories (add edge main & community)
RUN echo "https://dl-3.alpinelinux.org/alpine/edge/main" >> /etc/apk/repositories
RUN echo "https://dl-3.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories

# install ghc + cabal, sources
RUN apk update
RUN apk upgrade
RUN apk add alpine-sdk git linux-headers ca-certificates gmp-dev zlib-dev ghc cabal curl

# install stack
RUN curl -sSL https://get.haskellstack.org/ | sh

# build app
RUN stack config set system-ghc --global true # tell stack to use the global ghc, installing GHC with stack fails
RUN stack install --local-bin-path /root/api/target --ghc-options='-optl-static -optl-pthread' --force-dirty

builder container (the “second” container which builds the actual app)

# Note this file is meant to be run from the TOP LEVEL `api` directory using `-f` option to docker
FROM builder-base

LABEL description="Builds API static binary (with the help of builder-base)"

# Copy fresh code, likely updated version of what was used on the base builder
COPY . /root
WORKDIR /root/api

# (re-)install
RUN stack install --local-bin-path /root/api/target --ghc-options='-optl-static -optl-pthread' --force-dirty

Please take this code with a grain of salt, and think about it critically before using. I am almost 100% sure that there is a better way to do the caching (without using two different Dockerfiles), and am fairly certain I’m just doing it wrong in some way or another.
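The Make commands that wrap these two Dockerfiles are not shown in this post, so the following is only a guess at their shape: build the base image, build the app image on top of it, then copy the binary out of a container created from the app image. The Dockerfile names, the container name, and the output directory are all assumptions. Setting DOCKER=echo prints the commands instead of running them.

```shell
#!/bin/sh
# DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

build_static_binary() {
    # 1. Base image: GHC, stack and the resolver packages (slow, rebuilt rarely).
    "$DOCKER" build -f api/Dockerfile.builder-base -t builder-base . &&
    # 2. App image: starts FROM builder-base and recompiles only the app.
    "$DOCKER" build -f api/Dockerfile.builder -t builder . &&
    # 3. Create (but do not start) a container so files can be copied out.
    "$DOCKER" create --name builder-extract builder &&
    mkdir -p ./target &&
    "$DOCKER" cp builder-extract:/root/api/target/. ./target/ &&
    "$DOCKER" rm builder-extract
}

# Example (from the directory that contains api/):
#   build_static_binary
```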

Updates

After posting this to r/haskell, I’ve been lucky enough to get some great feedback about what I’ve been doing wrong – here’s how to make this process better:

Use multi-stage docker build support

Thanks to a comment from reddit user taylorfausak, I now know the better solution for this multi-stage build is… multi-stage builds with Docker! All the good caching and small image sizes without the hassle of keeping/maintaining multiple containers! We did it reddit (thanks to taylorfausak, and of course the devs over at docker/moby? HQ)
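To make the idea concrete, here is a rough sketch of what a multi-stage version of the two Dockerfiles above might look like, wrapped in a small shell helper that writes it to disk. Several things are assumptions rather than facts from this post: the api.cabal file name, the api-static binary name under target/, and the trick of copying only the build manifests first so that docker can cache the dependency build.

```shell
#!/bin/sh
# Write a multi-stage Dockerfile sketch to the path given as $1.
write_multistage_dockerfile() {
    cat > "$1" <<'EOF'
# ---- stage 1: build the binary (same packages as builder-base) ----
FROM alpine AS builder
RUN echo "https://dl-3.alpinelinux.org/alpine/edge/main" >> /etc/apk/repositories \
 && echo "https://dl-3.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
 && apk update && apk upgrade \
 && apk add alpine-sdk git linux-headers ca-certificates gmp-dev zlib-dev ghc cabal curl
RUN curl -sSL https://get.haskellstack.org/ | sh
RUN stack config set system-ghc --global true
WORKDIR /root/api
# Copy only the build manifests first: this layer and the dependency build
# below come from cache until stack.yaml or the .cabal file change.
# (api.cabal is an assumed file name.)
COPY api/stack.yaml api/api.cabal ./
RUN stack build --only-dependencies
# Now copy the sources; only the steps from here on rerun after a code change.
COPY api/ ./
RUN stack install --local-bin-path /root/api/target --ghc-options='-optl-static -optl-pthread'

# ---- stage 2: deployment image, same base OS as the builder ----
FROM alpine
COPY --from=builder /root/api/target/api-static /var/app/bin/api-static
CMD ["/var/app/bin/api-static", "RunServer"]
EOF
}

# Example (from the directory that contains api/):
#   write_multistage_dockerfile Dockerfile.multistage
#   docker build -f Dockerfile.multistage -t api .
```

Only the final stage ends up in the image that gets shipped, so GHC, stack and the build tools stay out of the deployment container.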

Use the proper cabal config (no need for the crtbeginT.o hack)

Reddit user erebe pointed out that the way I’m doing it is actually wrong, -optl-static is not the way. Turns out building the static binary is as simple as:

  1. Adding ld-options: -static to your cabal file
  2. Mounting the project directory into the container
  3. Running stack clean; stack install --split-objs --ghc-options="-fPIC -fllvm"

More information about it can be found in a related reddit thread

Switching to these instructions DOES work for me and produces a static executable, with much less work (though in my solution I actually didn’t need the crtbeginT.o hack anymore).
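One way to double-check the result is the file utility, which (to my knowledge) reports “statically linked” or “dynamically linked” for ELF binaries. A small sketch that classifies a line of file output; the function name is hypothetical. Keep in mind that a “static” verdict does not catch the glibc libraries that get loaded at runtime, which is the problem described earlier in this post.

```shell
#!/bin/sh
# Read one line of `file` output on stdin and print static, dynamic or unknown.
classify_linkage() {
    case "$(cat)" in
        *"statically linked"*|*"static-pie linked"*) echo static ;;
        *"dynamically linked"*) echo dynamic ;;
        *) echo unknown ;;
    esac
}

# Example (the binary path is an assumption):
#   file target/api-static | classify_linkage
```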

UPDATE - While this method will work just fine from inside the alpine container that builds the project, local builds (like running stack build) on Arch Linux will break, due to the lack of statically built versions of gmp. To fix this, you can either install the static version of gmp with the instructions above, or enable the new process only when inside the build container. I went the route of pulling down the sources for gmp, and installing the static libraries locally – as described earlier in this guide. After installing the static version of gmp, stack build commands are now inundated with warnings about the shared libraries in glibc which are described earlier in this post, which I’ve promptly ignored.
