Awesome FOSS Logo
Discover awesome open source software
Launched 🚀🧑‍🚀

Quick And Dirty Forward-Only DB Migrations

Categories

While working on the The Start (a now defunct project)’s jobs backend, I found myself at the age-old problem of how to migrate my database (which is surprisingly SQLite). I looked at tools like flyway, migrate, but they all kind of seemed like more than I needed. If anything I was leaning towards migrate, but flyway is a very well-known and often used solution that it was also pretty appealing. In the end, I decided to just write a quick makefile target (basically a bash script) for a few reasons:

  • My deploy process (at that point) was Makefile driven, I could gate my deploys on whether migration succeeded
  • The app itself had no idea about the database versioning (I added it to the code proper later, which I’ll probably make antoher post about)
  • I only cared about going forward in database versions
  • All migrations were written idempotently, they could be run multiple times and nothing bad would happen. Basically a lot of CREATE TABLE IF NOT EXISTS and CREATE INDEX IF NOT EXISTS, with the ocassional table-rebuild due to a deep schema change.

All these reasons meant for me that it was better to just spend 30 mins to an hour writing a somewhat-robust migration script than learning and incorporating more powerful tools like flyway and migrate. In the back of my mind I also had the goal of moving DB migration (at the very least database version direction) into the app itself, either as a utility or a startup thing or both.

The idea is pretty simple:

  • Define migrations that work as SQL transactions
  • Name them as some monotonically increasing number sequence (ex. 1.sql, 2.sql, 3.sql)
  • Loop through numbers, find the files, do what you need to do.

Below, I’ll present both my initial forward-only db migration scheme in bash, and then the resulting Haskell implementation that came around and replaced it later.

Makefile target for running local migrations

DB_LATEST_VERSION_LOCAL=6
migrate-db-local:
  @echo -e "Running migrations on local DB..."
  @for i in $(shell seq 1 $(DB_LATEST_VERSION_LOCAL)) ; do \
    echo "Migrating DB to version $$i..." ; \
    cat $(SQL_MIGRATIONS_FOLDER)/$$i.sql | $(SQLITE) $(SQLITE_DB_LOCAL) ||  { echo "ERROR: Migration to version $$i failed" ; exit 1; }; \
  done

A few things to explain here:

DB_LATEST_VERSION - Since I was (at this point) doing all my deploys with an Makefile+SSH driven approach, the database version that needed to be migrated to had to be kept in the makefile itself (along with the actual app version to deploy). Generally, the DB_LATEST_VERSION would change, but only be deployed locally (local and remote PROD deployments are separate), until the changes were ready to go to PROD.

Makefile target for running remote migrations (PROD)

DB_LATEST_VERSION_PROD=6
migrate-db-prod:
  @echo -e "Running migrations on remote host..."
  @for i in $(shell seq 0 $(DB_LATEST_VERSION_PROD)) ; do \
    echo "Migrating DB to version $$i..." ; \
    cat $(SQL_MIGRATIONS_FOLDER)/$$i.sql | ssh $(SSH_ADDR) "$(SQLITE) $(SQLITE_DB_PROD)" ||  { echo "ERROR: Migration to version $$i failed" ; exit 1; }; \
  done

As you can see, there isn’t much of a difference between the local migration process and the remote migrations – the only real difference is feeding the commands through SSH.

How long was it like this?

Well the code wasn’t like this for very long, about one release (which is embarassingly about 1 month long for me), and served me well while I was using it which is why I wanted to share.

While having the app itself handle database migration is probably a controversial issue, at the very least your app should be able to check that the database is at the version it expects, and make an smart decision as to what to do (run in some sort of reduced capability mode, or error, etc). This code was removed after implementing both utility driven (ex. $ app RunDBMigrations is a utility I can run from the command line now) and runtime checking/migrating was added to the code base. The haskell code that makes the migrations happen at runtime (or when the RunDBMigrations task is run) is pretty similar:

-- ... lots of code ...
getMigrationSQL :: BackendConfig -> BackendVersion -> IO DT.Text
getMigrationSQL cfg = fmap DT.pack . readFile . (dbMigrationsPath cfg</>) . (++".sql") . show

doMigration_ :: BackendConfig -> S.Connection -> BackendVersion -> IO ()
doMigration_ cfg c version = getMigrationSQL cfg version
                             >>= Database.SQLite3.exec (S.connectionHandle c)

getBackendVersion_ :: S.Connection -> IO BackendVersion
getBackendVersion_ c = S.query_ c DBQ.getDBVersion
                       >>= \r -> return $ extractCount r

-- ... different helper functions ...

instance Backend SqliteBackend where

-- ... other typeclass method implementations ...

getCurrentVersion :: b -> IO BackendVersion
getCurrentVersion = maybe (error "Failed to get version, database disconnected") getBackendVersion_ . backendConn

migrateToVersion :: BackendVersion -> b -> IO (Maybe b)
migrateToVersion target b = maybe (return Nothing) handle (backendConn b)
    where
      cfg = backendCfg b
      logMigrationAttempt v = logMsg b INFO ("Attempting migration to DB version [" ++ show v ++ "]...")
      handle c = getCurrentVersion b
                 >>= \v -> logMsg b INFO ("DB Currently at version [" ++ show v ++ "], migrating to version [" ++ show target ++ "]")
                 >> forM_ [v+1..target] (\nextVersion -> logMigrationAttempt nextVersion >> doMigration_ cfg c nextVersion)
                 >> return (Just b)

-- ... lots more code ...

With that code, when the backend starts up it checks the database version and tried to migrate to a certain given version (passed in by config, loaded from various sources).