Running containers for dependencies at will on GitLab CI

tl;dr - During E2E testing I spin up containers of dependencies to run tests against them – the setup I normally use recently stopped working so I fixed it.

E2E tests are the most important tests for a project: it doesn’t matter that every endpoint handles invalid input correctly if users can’t check out or perform other crucial functions with good inputs. On past projects I’ve used Docker containers in E2E tests in CI to test the systems I’ve built as faithfully as possible to the conditions they actually run under. Recently I discovered that the setup I had been using no longer worked as expected: tests would start and run docker run, but were unable to reach the containers that had supposedly started successfully. The change was covered in the release notes, but I hadn’t seen those particular notes, and I was still unable to get the configuration described there to work for me.

I actually made a thread on r/gitlab about this apparent change/code rot, and no one knew the answer. One suggestion was to just use GitLab services, which are another fantastic feature that GitLab CI offers, but I didn’t want to have to re-use instances (I was working on redis-bootleg-backup at the time). I wanted to spin up fresh instances whenever I pleased, just like I do on my local machine.

Today, while working on a new project and adopting the same approach to make sure my E2E tests were running, I ended up finding the full solution to the misconfiguration I had been running into. The fix consisted of:

  • Specifying DOCKER_HOST (to tcp://docker:2375)
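  • Specifying DOCKER_TLS_CERTDIR (to '' to disable TLS, or to '/certs' to enable it; if you enable TLS, DOCKER_HOST must use port 2376 instead of 2375)
  • (optional) Specifying DOCKER_TLS_VERIFY and DOCKER_CERT_PATH so that the docker client actually uses the certs that the DOCKER_TLS_CERTDIR setting provides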
  • Changing the port binding to use 0.0.0.0:<port> for the external port
  • Referring to created containers with the docker hostname
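
To make the relationship between the first two settings concrete, here is a minimal sketch of a sanity check that could run before the tests start. The checkDockerEnv function and the DockerEnv type are hypothetical names made up for this illustration (they are not part of my test utilities), and the only rule it encodes is the one from the list above: port 2376 when TLS is enabled, port 2375 when it is disabled.

```typescript
// Hypothetical helper for illustration only, not part of the original project.
type DockerEnv = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the
// DOCKER_HOST port and the DOCKER_TLS_CERTDIR setting agree with each other.
function checkDockerEnv(env: DockerEnv): string[] {
  const problems: string[] = [];

  // '' disables TLS, any other value (e.g. '/certs') enables it
  const certDir = env.DOCKER_TLS_CERTDIR;
  if (certDir === undefined) {
    problems.push("DOCKER_TLS_CERTDIR is not set (use '' to disable TLS or a directory such as /certs)");
    return problems;
  }
  const tlsEnabled = certDir !== "";

  // Expect something like tcp://docker:2375 or tcp://docker:2376
  const match = /^tcp:\/\/[^:\/]+:(\d+)$/.exec(env.DOCKER_HOST || "");
  if (!match) {
    problems.push("DOCKER_HOST should look like tcp://<host>:<port>");
    return problems;
  }

  const port = Number(match[1]);
  const expectedPort = tlsEnabled ? 2376 : 2375;
  if (port !== expectedPort) {
    const state = tlsEnabled ? "enabled" : "disabled";
    problems.push(`TLS is ${state}, so DOCKER_HOST should use port ${expectedPort}, not ${port}`);
  }

  return problems;
}
```

In a Node-based test setup this could be called as checkDockerEnv(process.env) and the run aborted early if the returned list is not empty, which is a lot quicker than waiting for a pipeline full of connection errors.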

It turns out that the official docker image does a little bit of extra work to figure out and apply these settings for you, which is why things work so easily with it and no manual configuration. If you use an alternative image that just happens to have the docker CLI tool installed, you’ll have to set them yourself. I use a modified builder image, which is what was making things more difficult.

To explain what’s happening here as clearly as I can:

  • The container that will run the tests (we’ll call this test-container) is set up by GitLab CI
  • While test-container runs, a command like docker run talks to whichever daemon DOCKER_HOST points at (the official docker images do some checks to work out what DOCKER_HOST should be)
  • Setting DOCKER_TLS_CERTDIR lets docker know where to make/find the necessary certs
  • When your docker run ... command starts the port-mapped container, that container does not live inside test-container but next to the remote daemon. This means that when you try to access the container on the port you exposed, it isn’t at localhost/127.0.0.1; it’s at the docker hostname (i.e. the docker-in-docker service) instead, listening on all interfaces because you set 0.0.0.0 as the host.
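
The last point can be sketched in a few lines. This is a variation on what my own code does (mine simply checks for the CI environment variable and hardcodes the docker hostname): the hypothetical containerHost function below derives the hostname from DOCKER_HOST instead, on the assumption that a tcp:// value means a remote daemon and anything else means a local one.

```typescript
// Hypothetical helper for illustration only: works out where a container
// started with `docker run -p ...` is reachable from the test process.
function containerHost(env: Record<string, string | undefined>): string {
  // A tcp://<host>:<port> DOCKER_HOST means the daemon (and therefore the
  // published ports of the containers it starts) lives on <host>
  const match = /^tcp:\/\/([^:\/]+)(?::\d+)?$/.exec(env.DOCKER_HOST || "");
  if (match && match[1]) {
    return match[1];
  }

  // No DOCKER_HOST, or a local socket: published ports are on this machine
  return "127.0.0.1";
}
```

With the GitLab CI configuration described in this post, containerHost({DOCKER_HOST: "tcp://docker:2376"}) gives "docker", while on a local machine with no DOCKER_HOST set it falls back to "127.0.0.1".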

This took me a while to figure out, but after a bunch of failed tests in the pipeline I was finally able to get it. While working on redis-bootleg-backup I actually just bundled the dependencies (downloading the redis-server binary itself into the builder image I use), but it’s nice not to have to worry about doing that for postgres, and to be able to use docker as usual.

The code

First, let’s start with an excerpt of the .gitlab-ci.yml file:

# ... other yaml code ...

e2e:
  stage: test
  only:
    - /release-v[0-9.]+/
    - merge_requests
  services:
    - docker:dind
  variables:
    DOCKER_HOST: tcp://docker:2376 # TLS enabled (2375 for disabled)
    DOCKER_TLS_CERTDIR: '/certs'
    DOCKER_TLS_VERIFY: 1
    DOCKER_CERT_PATH: '/certs/client'
  script:
    # docker's default is alpine-based
    - make build test-e2e-parallel

# ... other yaml code ...

Here’s what some of the code that checked for the CI environment looked like:

const rndm = require("rndm");

/**
 * Create connection details for a random test database
 *
 * @returns {Promise<DBConfig>}
 */
export function generateTestDBConfig(): Promise<DBConfig> {
  let database = `test_db_${rndm(6).toLocaleLowerCase()}`;

  if (process.env.USE_LOCAL_DB_FOR_TEST) {
    database = "your_db";
  }

  const config = new DBConfig({
    database,
    host: process.env.DB_HOST || "127.0.0.1",
    port: 5432,
    connectionName: database,
  });

  // Use a randomized free port (both locally and in CI) to avoid collisions
  return getPort()
    .then(port => {
      const updated = Object.assign(config, {port});

      // If we're in CI we need to use a different host for the likely remote
      // docker builder that `docker run` got executed on
      if (process.env.CI) {
        updated.host = 'docker';
      }

      return updated;
    });
}

The only real change here is where we use a different hostname – the randomized database name is all fine (the docker run command is run later with this configuration, so the container will have the right DB set up). Here’s some code from the later bit where we do the port bindings:

/**
 * Tape Test that starts a postgres instance to use in suites
 *
 * @param {tape} test - the tape top level export
 * @param {Function} cb - Callback that receives the container and options
 * @param {object} [opts] - Optional overrides
 * @param {DBConfig} [opts.dbConfig] - configuration details that will be used to make the pg instance
 * @param {object} [opts.containerEnv] - Environment to use within the container
 * @param {string} [opts.tagName] - image tag
 */
export function startContainerizedPostgresTest(
  test: any,
  cb: (cao: ContainerAndOptsWith<DBConfig>) => void,
  opts?: {
    dbConfig?: DBConfig,
    containerEnv?: object,
    tagName?: string,
  },
) {
  const tagName: string = opts && opts.tagName ? opts.tagName : POSTGRES_IMAGE_TAG;
  const envBinding: {[key: string]: string} = Object.assign(
    {},
    POSTGRES_CONTAINER_DEFAULT_ENV,
    opts && opts.containerEnv ? opts.containerEnv : {},
  );

  let buildConfig: () => Promise<DBConfig> = generateTestDBConfig;
  if (opts && opts.dbConfig) {
    buildConfig = () => {
      if (opts && opts.dbConfig) {
        return Promise.resolve(opts.dbConfig);
      }
      return Promise.reject(new Error("dbConfig wasn't provided but it was earlier"));
    };
  }

  // Will be created later
  let builtConfig: DBConfig;

  test("Starting postgres instance", (t: Test) => {
    let containerAndOpts: ContainerAndOptsWith<DBConfig>;

    buildConfig()
      .then(dbConfig => {
        // The DBConfig may have different values than postgres container defaults
        // so we must update env bindings accordingly
        // tslint:disable-next-line: no-string-literal
        if (dbConfig.database) { envBinding["POSTGRES_DB"] = dbConfig.database; }
        // tslint:disable-next-line: no-string-literal
        if (dbConfig.username) { envBinding["POSTGRES_USER"] = dbConfig.username; }
        // tslint:disable-next-line: no-string-literal
        if (dbConfig.password) { envBinding["POSTGRES_PASSWORD"] = dbConfig.password; }

        builtConfig = dbConfig;
      })
      // Start the container
      .then(() => {
        if (!builtConfig.port) { throw new Error("built DBConfig does not contain a port"); }

        let portBinding: PortBinding = {5432: builtConfig.port};

        // If we're in CI, we'll need to attempt to make the instance exposed on a
        // possibly remote docker daemon
        if (process.env.CI) {
          portBinding = {5432: `0.0.0.0:${builtConfig.port}`};
        }

        return startContainer(t, {
          imageName: POSTGRES_IMAGE_NAME,
          tagName,
          portBinding,
          envBinding,
          startTimeoutMs: 50000,
          // Postgres actually puts out the *final* ready to start message
          // after doing setup on stderr, *not* stdout
          waitFor: {stderr: POSTGRES_STARTUP_MESSAGE},
        });
      })
      // ... more code ...

The important bit here is that if we’re in CI, we need to use the 0.0.0.0 host, and we bind to whatever outward port the DBConfig said should be open.
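
The internals of startContainer aren’t shown here, so as a rough sketch of where a binding like this ends up: docker run accepts published ports as -p <host port>:<container port> or -p <ip>:<host port>:<container port>. The toPublishArgs function and the PortBindingSketch type below are hypothetical names for illustration, assuming a binding is a map from container port to either a bare host port or an "<ip>:<port>" string as in the code above.

```typescript
// Hypothetical types and helper for illustration only; the real PortBinding
// type and startContainer implementation are not shown in this post.
type HostBinding = number | string;
type PortBindingSketch = { [containerPort: number]: HostBinding };

// Turns {5432: "0.0.0.0:54321"} into ["-p", "0.0.0.0:54321:5432"], the
// argument form `docker run` expects for publishing a port
function toPublishArgs(binding: PortBindingSketch): string[] {
  const args: string[] = [];

  Object.keys(binding).forEach((containerPort) => {
    const host = binding[Number(containerPort)];
    // The host side goes first, the container port last
    args.push("-p", `${host}:${containerPort}`);
  });

  return args;
}
```

Locally, {5432: 54321} becomes -p 54321:5432, and in CI {5432: "0.0.0.0:54321"} becomes -p 0.0.0.0:54321:5432, which is what makes the container reachable from test-container via the docker hostname.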

If you want to see what a test looks like with all this put together (with tests powered by substack/tape), here’s a chunk:

// Container and DB configuration for the containerized postgres (SHARED_PG),
// filled in later by the callback once the container has started
let SHARED_PG: ContainerAndOptsWith<DBConfig>;

TestUtil.startContainerizedPostgresTest(test, cao => SHARED_PG = cao);

// Ensure creating & retrieving a user works (w/ db reset)
test("creating user works", async (t: Test) => {
  const db = await TestUtil.makeTestPostgresDB(SHARED_PG.extra);

  // Create the user
  const user = await db.registerUser(TestUserFixtures.john);

  t.assert(user, "user was created");
  t.assert(user.uuid, "user has a uuid");
  t.equals(user.version, 1, "user's initial version is 1");

  t.end();
});

NOTE: startContainerizedPostgresTest actually registers a pseudo-test that starts up the database. It sits at the top level and runs just like the test(...) call below it.

Alternatively, using GitLab Services

Of course, instead of doing all this, you could just use GitLab services and make sure you use your resources in a multi-tenancy-safe way (clearing databases, resetting configuration, etc.). Using containers lets me sidestep this requirement and run tests as closely as possible to how I run them locally, so I do prefer the hack(s) described above over trying to ensure that every dependency has enough multi-tenancy features to make this convenient for me.

Wrapup

That’s it for today! Hopefully this helps someone out who was doing something similar and may have gotten stuck or had things break because of this change.