tl;dr - There are lots of ways to get smarter about how you deploy. Ansible is one choice; it’s not the fanciest, but it’s amazing.
NOTE: This is not an introduction to ansible; please check out the official documentation if you want that.
For most of my projects, I use a GNU Make (`Makefile`-based) build process. I do that because it’s cross-platform, pretty well supported/known (for people who build software), and easy to standardize on no matter what project I’m working on. For example, if I’m working on a JS project and have a bunch of grunt tasks, I proxy the real important top-level ones through `make` targets, so that I have a unified flow like `make build` that matches another project that might be in some other language (which doesn’t have grunt). It’s another layer of abstraction much of the time (as people love to build tools for their favorite languages), but for me it’s worth it.
Over the last year, I’ve boarded the container hype train (all the innovation enabled by LXC: Docker, rkt, etc.), and am using containers to deploy my applications. This hype train was indeed worth getting on: as more and more languages include great libraries to run servers of various kinds, reducing my deployments to a light virtual machine that just runs a server I wrote is getting easier and easier. Switching to a container-driven deployment process has saved me from having to worry about putting my source code on the server, setting up dependencies, and doing a lot of other tedious work that differed from language to language (i.e. `pyenv`/`virtualenv`/`bundler`/`npm`/`go get`/etc.). Now I just put a container on the server that runs a program listening on port 5000 (or whatever), and as long as some other program on the server is listening on port 80 and forwarding traffic to port 5000, I’m good. The simplest instance of this means I just run NGINX, tell it that there’s something at `localhost:5000` that it should proxy to, and I’m off to the races.
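That NGINX configuration can be tiny; a minimal sketch (the server name is a placeholder):

```nginx
server {
    listen 80;
    server_name myapp.example.com;  # placeholder

    location / {
        # forward everything to the app container listening on 5000
        proxy_pass http://localhost:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```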
I currently use Docker as my container runtime of choice because I tried rkt once (twice? maybe once) and couldn’t get the hello world example to run. There are features of rkt that make it attractive to me, but I didn’t (and still don’t) have the patience to work with software that makes me scratch my head at “Hello World” unless it’s the only option. Since I already had experience with Docker and it’s got some very good ergonomics, I stuck with it. I’m definitely glad that there’s an alternative out there though; it should help keep the Docker crew honest/on their toes and innovating. Maybe I’ll try rkt again in the future.
Things I ran into while boarding the hype train
Don’t use the `devicemapper` storage driver if you have any other option. One of my servers runs ArchLinux and I was able to switch to the `overlay` filesystem, and my life managing docker containers became so much better.
Union filesystems are super cool. If I were to give you a bad, probably wrong explanation: they basically express your hard-drive state as a series of deltas built on top of one another. If you start at state 0 with an empty hard drive and create a file (let’s say “README.txt”), a new state is built (let’s call it state 1) which says “take state 0, and add README.txt”. From an efficiency point of view there are some drawbacks, but the deduplication features and the ability to do things like go back in time (just going to a previous state of your hard drive) are amazing.
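You can poke at the delta idea directly with overlayfs; a tiny sketch (requires root, and all the paths are made up):

```sh
# state0 is the read-only base layer; state1 collects every change made on top
mkdir -p /tmp/state0 /tmp/state1 /tmp/work /tmp/merged
mount -t overlay overlay \
    -o lowerdir=/tmp/state0,upperdir=/tmp/state1,workdir=/tmp/work \
    /tmp/merged

# creating README.txt through the merged view writes the delta into state1;
# state0 itself is never modified
touch /tmp/merged/README.txt
```

Rolling back is then just a matter of discarding (or swapping out) the upper layer.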
Union-based filesystems are also one of the only good answers to those cryptolocker schemes that are running around the internet today: if someone tried to encrypt your hard drive, and they started yesterday, theoretically all you’d have to do is roll back the contents of your hard drive! Obviously if the attacker compromised your ability to roll back, or messed with the hard-drive management code in some way, then you couldn’t, but read-only access and hardware protections are options to mitigate that. Anyway, this article isn’t even about filesystems so I’ll stop there.
In my opinion, if you’re doing things right in the software world in 2017, your application should be deployed in the span of ONE button press or shell command. For me, on many of my projects right now, that “one step” looks like this:
make build deploy SSH_ADDR=<server ip>
As previously described, I use `make` a lot across my projects, and I generally try to make this one line do everything necessary to build the application and deploy it, using whatever means necessary. Here’s what this looks like for this blog itself:
- `Dockerfile` stored in the repo; build a docker image based on NGINX with the static site contents

I should probably go into how exactly I do this in another post (bits are sprinkled around this blog already), but that’s talk for another day.
Either way, the steps above can very easily differ from project to project, so it’s really important to me to be able to have a command that just does what needs to be done, end of story (unless something goes wrong of course).
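As a hedged sketch of the `make` side (the image name, registry, and service name are placeholders, not my actual setup), the flow could look like:

```make
IMAGE ?= registry.example.com/blog:latest

.PHONY: build deploy

build:
	docker build -t $(IMAGE) .

deploy:
	docker push $(IMAGE)
	ssh root@$(SSH_ADDR) "docker pull $(IMAGE) && systemctl restart blog"
```

`make build deploy SSH_ADDR=<server ip>` then runs both targets in order, and the only per-invocation knowledge needed is the server address.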
If you go out and get a VPS (I use a company called INIZ), or get a super cheap dedicated server like I did recently, there’s a bunch of things you need to do to get that server to a state where it can be used for running applications (and other services, like email) in production in a reasonably secure/responsible manner. Here are just some of the things (a by-hand sketch follows the list):

- Disable logging in as `root` in favor of SSH keys
- Set up a non-root user in `sudoers`
- Update the system’s packages (via `apt-get` or `pacman`)
- Install `docker`
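Done manually, those steps look something like this (an Arch-flavored sketch run as root; the username is a placeholder):

```sh
# 1. disable root login over SSH in favor of keys
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
systemctl restart sshd

# 2. create a non-root admin user with sudo rights
useradd -m admin
usermod -aG wheel admin   # then allow %wheel in /etc/sudoers via visudo

# 3. bring the system's packages up to date
pacman -Syu --noconfirm

# 4. install and enable docker
pacman -S --noconfirm docker
systemctl enable --now docker
```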
Whenever there is a thing to do, there is more than likely a hierarchy of how to do it intelligently. I refer to this as “operational intelligence”. If we take “setting up a server” to be the thing to do, here are some of the different levels of the hierarchy of operational intelligence I’ve witnessed (from what I’ve seen at companies, to what I do in my own projects, to what’s out there that I know of):

- …
- Let the cloud do it: setups where the building blocks (like `systemd`) are something you just can use without thinking about

NOTE: Getting to the level of “let the cloud do it” starts to bring about this really interesting concept of treating your services/servers as cattle, NOT pets.
So with this knowledge of what it takes, and what’s out there, I started to research my options for a project I’ve been working on (and my projects going forward). Since this is basically a chance to rethink how I deploy, I started to look at how I could get as close to the “servers as cattle” mantra as possible, because it seems like the way forward.
I started mulling over a few things:
Fabric/Ansible - Mentioned above; I actually had a bunch of experience with Fabric at the time of making this decision (the last time I used it, I was at the point of imagining how to extend it and build a bigger provisioning tool out of it), but Ansible was a big question mark.
NixOS - “Nix” can refer to a bunch of things (an OS, a package manager, and a programming language), but it’s based on reproducibility, some of that cool union filesystem stuff, and a bunch of other cool concepts, and it seems to be just about the pinnacle of if-it-works-once-it-works-forever life.
Terraform - I covered it a little bit before, but I really like the idea of a meta tool that manages my cloud for me.
In the end, I went with Ansible, for a few reasons:
Reason #1: I didn’t want to spend the next week/few weeks understanding and learning to properly use Nix. I wanted to get up and running quickly, and unfortunately that meant trading the features and benefits of Nix for something a little closer to what I was familiar with. While it’s spoiled of me to expect/require this, I found that getting started with Nix was going to take more than 10 minutes of doing a “hello world”, and that’s often the level of patience I have with relatively-mature open source projects these days. If you want me to get on your hype train (it’s fine if you don’t, you probably don’t want too many band-wagoners on your train), shorten the on-ramp. I still firmly believe that I’ll be revisiting Nix or something like it in the future, because it really seems like the way forward: determinism is one of the sexiest things in computing… that feeling of something working, then working the same way every time, is amazing.
Reason #2: Python’s pretty good for scripting/getting dirty when you have to. I no longer expect anything to go smoothly, so I prioritize solutions that degrade, in the user-failure case, to something that’s full-featured, powerful, and that I likely already understand. Consider how using AWS degrades: if something goes wrong while you’re working with AWS, you can drop down a layer (from the web console, to the CLI, to raw API calls) until you hit something you understand.
That’s just an example of what I mean by considering how some tool degrades in user-failure cases. Using that analogy, I like how ansible degrades, because I have roughly two steps: use a built-in ansible module when one fits, and drop down to shell commands (or plain Python) when it doesn’t.
Reason #3: It’s got more stuff built in than Fabric. For example, managing systemd was (and is) a big part of my management flow, and ansible has great support for it.
Reason #4: It doesn’t require a base server, but has the option if you need it. I’m a super small operation; I only have one server to manage (there are actually three, but I’m trying to downsize since I got the nice big dedicated machine now). I like that ansible scales up if I need it to, but starts small and simple, with one server.
After all this text, we finally start getting to the real content: I found Ansible to be VERY enjoyable to use. After casually reading through the awesome documentation, I started trying to get it set up on my servers, which required reaching out to a few other sources. Those articles helped me solidify my knowledge of ansible, in particular the directory structure and what the files were supposed to look like. Once I became comfortable with those aspects of the workflow, my productivity skyrocketed. Here are just a few reasons why I love it:
- Need to `daemon-reload` systemd? ez pz, add a `daemon_reload: yes` to some other command and it’ll get done.

Of course no tool is without its warts, and here is one thing I found while using ansible that I was less than excited about: the number of moving concepts you have to keep straight at first.
A Playbook is a recipe for transforming a piece of inventory into the state you want it to be in. “The state you want it to be in” is defined by the playbook by specifying Roles you want that piece of inventory to play. Roles have tasks assigned to them, which are what needs to happen for the server to play the role (e.g., it should have nginx installed, it should have the web-app files, etc.). Templates can be used in tasks (but don’t have to be).
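To make that concrete, here’s a minimal sketch of how the pieces could fit together (all names here are hypothetical, not from my actual setup):

```yaml
# site.yml - a playbook: maps the "webservers" group from the inventory
# onto the roles those hosts should play
---
- hosts: webservers
  become: true
  roles:
    # tasks live in roles/webapp/tasks/main.yaml,
    # templates in roles/webapp/templates/
    - webapp
```

You’d run it with something like `ansible-playbook -i inventory site.yml`.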
I was so happy using ansible that I wanted to make this blog post. Now that I’ve spilled those happy feelings all over the internet, I’ll leave you with a small (old, so pardon if it’s missing some key conventions) configuration I wrote that showcases how simple it was for me:
```yaml
---
- name: login to the docker registry that contains the webapp container
  shell: docker login -u gitlab-ci-token -p "{{registry_access_token}}" "{{registry}}"

- name: get version {{webapp_version}} of the the-start-webapp container
  docker_image:
    name: "{{webapp_container_name}}"
    tag: "{{webapp_version}}"
    state: present

- name: add webapp systemd service
  become: true
  template:
    src: webapp.service.j2
    dest: /etc/systemd/system/webapp.service

- name: start & enable the webapp systemd service
  become: true
  systemd:
    name: webapp
    state: started
    enabled: yes
    daemon_reload: yes
```
As you can read, this short list of tasks (which was stored @ `roles/webapp/tasks/main.yaml` in my infrastructure-management repo) does everything necessary to get a web app started, leveraging `systemd` for starting it and managing it later. I wrote this in minutes, with a little bit of alt-tabbing back and forth to the documentation, and it just worked. That was huge for me.
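The tasks above reference a `webapp.service.j2` template that isn’t shown here. A hypothetical reconstruction (the port, names, and docker flags are my assumptions, not the original file) might look roughly like:

```ini
# webapp.service.j2 (hypothetical sketch, rendered to /etc/systemd/system/webapp.service)
[Unit]
Description=the webapp container
After=docker.service
Requires=docker.service

[Service]
# clear out any stale container first; the leading "-" tolerates failure
ExecStartPre=-/usr/bin/docker rm -f webapp
ExecStart=/usr/bin/docker run --rm --name webapp -p 5000:5000 "{{webapp_container_name}}:{{webapp_version}}"
ExecStop=/usr/bin/docker stop webapp
Restart=always

[Install]
WantedBy=multi-user.target
```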
Of course, when I logged into the server to confirm (trust but verify), I found an error, but it was actually my fault! The registry name was invalid. I believed in ansible enough that I took down the running production container and re-ran the ansible task to put it back after fixing it. Of course, the application isn’t heavily used, and in general you don’t want downtime, but I was feeling particularly scrappy at that moment so I did it (no ragrets). There’s a bit more I could say about it, but I think that shows just how much I now trust this tool in my toolbox.