Dockerizing Legacy Scoop for Reproducible Development Environments

Sept. 14, 2020

That title is a bit of a mouthful, huh? Probably a good idea to start with some context. "Scoop" is the name of the CMS at the New York Times, which is actually made up of a number of separate frontend apps that sit atop a shared backend. Most of the time I work on Oak, the collaborative rich text editor that the newsroom uses for writing news and opinion stories. Oak actually has its own datastore (Google's Cloud Firestore) and backend, but presently the only way to be a part of the Scoop ecosystem is to also store at least a subset of our data in the shared backend.

This leads to an unfortunate situation. Most Oak developers work primarily in JavaScript, on lightweight Node.js applications that are fairly easy to install and get running. Only very infrequently does one of us need to start up one of the main backend applications (named cms-api, cms-publishing, and cms-web). These applications are written in Java, a language unfamiliar to most Oak devs, and rely on a number of system setup quirks, like the locations of various SSL certificates and particular JDK and Maven versions. There's a beastly Ansible script that's meant to set up all of these requirements on each dev's machine, but because it does so much, and because these apps are touched so infrequently, more often than not it's broken when it comes time to use it.

Warning Shots

This was the state of the world when a new developer joined our team earlier this year. They weren't new to the Times, but they were new to Scoop, so they had an existing Times laptop, but had yet to run the cms-devtools setup.sh script. One day, they courageously picked up a ticket that required spinning up cms-api locally. Three full days later, after spending hours conversing with the platforms team that owned the setup script, and hours more bashing their head against their keyboard, they were still unsuccessful. Despite everyone's best efforts, there was now a class of work that one Oak engineer was permanently unable to contribute to.

Other engineers were still able to run cms-api locally, though, so we finished the work and moved on. But then it happened again: an engineer who had previously been able to run the backend apps with no issues suddenly couldn't. Three more days of furious debugging ensued. The problem was probably that the backends had migrated to asymmetric keys for authentication... but following the steps that should have installed the new keys didn't resolve it. The number of Oak engineers who couldn't run a critical dependency of our software locally was growing alarmingly; after the second one, two other engineers and I tried, only to realize that we were hitting the same issue!

A One-Two Punch: Docker and Bash Scripts

I have no idea why I'm using boxing lingo; you'll have to bear with me.

This had quickly gone from worrisome to untenable. The problem was pretty clear: the local development environment for the Scoop backend apps simply wasn't reproducible. Luckily, this is the very problem that container technology like Docker was built to solve! What we need is a Dockerfile that sets up an environment that we can use for local development. Then we can just mount our source code directories as volumes and voila, everything should always work.

The good news is that these apps are deployed with Kubernetes, which itself is powered by Docker, so there's already an ecosystem of Dockerfiles and configuration for us to pull from. Specifically, there's a base Docker image, cms-gke-jvm-base, that installs the appropriate version of the JDK (OpenJDK 11, in this case), creates some environment variables and directories used by all of the backend apps, and installs GCSFuse (a "user-space filesystem for interacting with Google Cloud Storage"). We'll start with that as our base:

FROM cms-gke-jvm-base:openjdk11-latest

The next piece of the puzzle is installing Maven, the dependency manager we use for our Java projects. This piece took me a little while for somewhat silly reasons.

ARG MAVEN_VERSION="3.6.3"
ARG USER_HOME_DIR="/root"
ARG APACHE_MIRROR_BASE_URL="http://apache.mirrors.pair.com/maven/maven-3/${MAVEN_VERSION}/binaries"
ARG APACHE_CENTRAL_BASE_URL="https://downloads.apache.org/maven/maven-3/${MAVEN_VERSION}/binaries"

RUN mkdir -p /usr/share/maven \
    && curl -Lso /tmp/maven.tar.gz ${APACHE_MIRROR_BASE_URL}/apache-maven-${MAVEN_VERSION}-bin.tar.gz \
    && curl -Lso /tmp/maven.tar.gz.sha512 ${APACHE_CENTRAL_BASE_URL}/apache-maven-${MAVEN_VERSION}-bin.tar.gz.sha512 \
    && echo "$(cat /tmp/maven.tar.gz.sha512)  /tmp/maven.tar.gz" | sha512sum -c - \
    && tar -xzC /usr/share/maven --strip-components=1 -f /tmp/maven.tar.gz \
    && rm -v /tmp/maven.tar.gz \
    && ln -s /usr/share/maven/bin/mvn /usr/bin/mvn

ENV MAVEN_HOME /usr/share/maven
ENV MAVEN_CONFIG "${USER_HOME_DIR}/.m2"

The piece that I kept messing up here was the checksum validation. First, we download the actual Maven binary from a mirror, but (for security reasons that made a lot of sense once I stopped to think about them) the checksum should only be downloaded from Apache's central download server, never from a mirror. What really stumped me, though, was the input that sha512sum expects: there need to be two spaces between the checksum value and the name of the file.
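
To make that quirk concrete, here's a quick illustration of the same check the RUN step above performs, written as a plain shell sketch:

# Apache publishes the .sha512 file as a bare hash with no file name,
# so we construct the "<hash>  <filename>" line that sha512sum -c wants,
# with two spaces between the hash and the file name.
expected="$(cat /tmp/maven.tar.gz.sha512)"
echo "${expected}  /tmp/maven.tar.gz" | sha512sum -c -
# prints "/tmp/maven.tar.gz: OK" and exits 0 if the download is intact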

With the JDK provided by the base image and Maven installed by the RUN step above, we're actually pretty close to being able to start up our apps. The environments for these apps are so similar that I have a hunch we can reuse this Dockerfile for all of them. Let's add a build argument that lets the builder specify which app they're building the image for.

ARG APP_NAME="cms-api"
ENV APP_NAME ${APP_NAME}

There's a start.sh script in each of the app repos that runs Maven with the correct arguments; it needed to be modified a bit to work correctly in the Docker environment. We haven't actually added the source code to the image (we'll mount it as a volume later), but now that we have the app name available as an environment variable, we can use it to tell the start script where to run.

#!/usr/bin/env bash
pushd "/opt/${APP_NAME}" || exit

MAVEN_OPTS_LIST=(
    "-Xms256m"
    "-Xmx512m"
    "-Dhazelcast.jmx.detailed=true"
    "-Dhazelcast.wait.seconds.before.join=3"
    "-Dhazelcast.health.monitoring.level=OFF"
    "-Dhazelcast.max.operation.timeout=10000"
    "-Dcms.logs=/var/nyt/logs/cms"
    "-Dsun.net.inetaddr.ttl=5"
    "-Dsun.net.inetaddr.negative.ttl=5"
    "-Dfile.encoding=UTF-8"
    "-Dcom.sun.management.jmxremote"
    "-Djetty.port=${JETTY_PORT}"
    "-Dcms.host=localhost"
    "$@"
)

MAVEN_DEBUG_OPTS_LIST=(
    "-Xdebug"
    "-Xnoagent"
    "-Djava.compiler=NONE"
    "-Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=*:8000"
)

if [[ -z $SCOOP_DEBUG ]]; then
    export MAVEN_OPTS="${MAVEN_OPTS_LIST[*]}"
else
    export MAVEN_OPTS="${MAVEN_OPTS_LIST[*]} ${MAVEN_DEBUG_OPTS_LIST[*]}"
fi

mvn -P dev compile exec:java

I also added a debug mode, which can be triggered by setting the SCOOP_DEBUG environment variable. That way developers can connect their text editors and IDEs to the running app to add breakpoints and do hot code reloading. Now we can add this script to our Docker image and set it as the entrypoint, and we should be ready to go!

COPY ./scripts/start.sh /opt/start.sh
ENTRYPOINT ["/bin/bash", "/opt/start.sh"]

Or... maybe not quite. There are still a few things preventing this Dockerfile from actually being useful as a development environment. The most obvious is the lack of any code for Maven to compile and run in these containers. We're also missing a number of environment variables that need to be set in order for the startup script to work properly.

In the ideal scenario, I don't think developers should have to think about these pieces of configuration at all. I've grown very fond of the notion of "sensible defaults", especially in a situation like this, where it's unlikely that developers will need to vary from the defaults at all. In that spirit, let's create a docker-compose.yml file that defines a service for each backend that we need to be able to spin up, with all of the correct environment variables and volumes.

Let's start with cms-api:

version: "3.8"

services:
  api:
    build:
      context: .
      dockerfile: ./Dockerfile
      args:
        APP_NAME: cms-api
    ports:
      - 8000:8000
      - 8089:8089

We pass cms-api as the APP_NAME build argument we set up earlier in our Dockerfile, and we expose two ports: 8089 is the port that the web server listens on, and 8000 is the port that the debug server listens on. There are a bunch of environment variables to set up next; I'll just call out the important ones.

    environment:
      JETTY_PORT: 8089
      SCOOP_DEBUG: ${debug}
      CONFIG_SERVER_PASS: ${CONFIG_SERVER_PASS}
      GOOGLE_APPLICATION_CREDENTIALS: /opt/svccreds/json_token

JETTY_PORT has to be set to the same value as the port we exposed in ports, and SCOOP_DEBUG is the environment variable we used earlier to tell Maven to run in debug mode. Here we've set it to the value of ${debug}, which means users can toggle it on by running debug=true docker-compose up, which feels pretty usable as an interface.
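
In practice, that looks something like this (assuming the service is named api, as above; attaching with jdb is just one option, and any JDWP-capable IDE pointed at localhost:8000 works too):

# normal startup
docker-compose up api

# startup with the JVM suspended, waiting for a debugger on port 8000
debug=true docker-compose up api

# then, from another terminal, attach a debugger
jdb -connect com.sun.jdi.SocketAttach:hostname=localhost,port=8000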

CONFIG_SERVER_PASS and GOOGLE_APPLICATION_CREDENTIALS are two different kinds of secrets. One of the most fragile parts of the old system was making sure that all of the relevant secrets were available and installed in the correct places, so something I wanted to think about carefully with this new system was how to keep managing secrets simple. These two variables are good examples of the two types of secrets this system relies on: the password for the config server is stored directly in the CONFIG_SERVER_PASS environment variable, whereas GOOGLE_APPLICATION_CREDENTIALS holds a path to a file that contains secrets.

A Quick Tangent: Secrets Management

Most secrets at the Times are stored in a self-hosted Vault instance. Vault provides a CLI for interacting with secrets, which is excellent for this kind of scripted automation. We also have a cert stored in GCS, which we can access with the gsutil CLI. Finally, we'll use jq and sed to manipulate the CLI output and pipe it into the appropriate locations in the resulting files.

To install these secrets, we're going to write a bash script, install-secrets.sh. The first thing we need to do is create a .env file for the secrets that need to be available as environment variables, like CONFIG_SERVER_PASS. We can add this file to our .gitignore so that it's not checked in to version control, and docker-compose will automatically make it available in our docker-compose.yml file.

vault read path/to/config-server-pass --format=json \
      | jq -r '"CONFIG_SERVER_PASS=" + .data.value' > .env

The --format=json option lets us pipe the output of the vault command directly to jq, which reads a specific path out of the resulting JSON object and formats it as an environment variable assignment in our .env file. We can repeat this pattern for the rest of our secrets, piping secrets and certificates to files in a local directory.
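
The rest of install-secrets.sh follows the same shape. As a rough sketch (the Vault path, jq selector, and GCS bucket below are placeholders, not the real ones):

# service account credentials for GOOGLE_APPLICATION_CREDENTIALS, written
# into the local secrets/ directory that docker-compose mounts into containers
vault read path/to/service-account --format=json \
      | jq -r '.data.value' > ./secrets/json_token

# certificate stored in a GCS bucket, fetched with gsutil
gsutil cp gs://placeholder-bucket/path/to/cert ./secrets/cert.pem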

In general, we want to avoid copying secrets into Docker images, since they're often published and can easily be inspected. Even though these images will likely only ever be built and used locally, certificates and passwords are likely to be rotated over time, and it would be nice if our users didn't have to rebuild their images when that happens. We can get around this by mounting our secrets files as volumes onto our running container, which again is fairly straightforward in our docker-compose.yml:

    volumes:
      - ./secrets/json_token:/opt/svccreds/json_token
      - ./.m2/settings.xml:/root/.m2/settings.xml

There are a few more volumes we want to mount for convenience. The first is ~/.m2/repository. Maven uses this directory to store all of its locally installed dependencies; by mounting the same directory from the host machine into each container, we can share dependencies between containers. That means if a dependency is installed to run cms-api, it won't need to be reinstalled to run cms-publishing.

The last two volumes we need to mount are the ones containing the actual source code! In order for this to work, we need to make sure that our code repositories are checked out in predictable locations. We can do that by installing them on our users' behalf with a simple setup script:

#!/usr/bin/env bash

git clone https://github.com/path/to/cms-core.git

git clone https://github.com/path/to/cms-api.git

git clone https://github.com/path/to/cms-publishing.git

git clone https://github.com/path/to/cms-web.git

# If the user has Maven installed, this will
# already exist, but we want to double-check
mkdir -p ~/.m2/repository

./scripts/install-secrets.sh

Now we always know where our code will be, so we can mount the rest of our volumes!

    volumes:
      - ./secrets/json_token:/opt/svccreds/json_token
      - ./.m2/settings.xml:/root/.m2/settings.xml
      - ~/.m2/repository:/root/.m2/repository
      - ./cms-api:/opt/cms-api
      - ./cms-core:/opt/cms-core

The Knockout

In the end, we have a fairly simple repository with only a tiny amount of code. Including the README.md, we have seven files, averaging only about 40 lines each. The prerequisites (git, vault, Docker, and jq) are common tools that are easy to install. The repo itself follows common best practices and is easy to understand and modify, and it wasn't all that hard to write, either!
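
For a new Oak developer, the whole workflow now boils down to a couple of commands (setup.sh here stands in for whatever we end up calling the clone-and-install script above):

# one-time setup: clone the backend repos and install secrets locally
./setup.sh

# whenever you need a local Scoop backend
docker-compose up api

# or, with the remote debugger enabled
debug=true docker-compose up api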