Cluster Organization in Docker Compose
I'll make a long story short here: this time last year, I knew nothing about containers or orchestration save that Vagrant had sounded like a cool idea but hadn't done much for me in practice. But we had an architect who did know more, and who set up our applications with a really quite fancy Kubernetes- and Docker-based build and deploy system (more on that some other time, perhaps). Our dev and QA environments became Kubernetes clusters, I started learning how it all worked, things were good. Then he moved on, leaving me and one other coworker who knew around as much as I did as the de facto experts on everything cloud here. Oops.
One thing neither the architect nor I had anticipated was that many of our enterprise clients turned out not to be on board with Kubernetes. At all. Some of them aren't even comfortable with Docker, period, but there's not much to do on that count except wait. For the rest, we decided that orchestration made things so much easier we were going to do it wherever we could get away with it, so we needed to have Docker Compose definitions ready to go.
I did inherit some basic Compose configs, but they were badly out of date; in the interim, we'd added a couple of Postgres extensions, integrated a single
A Rocky Start
Like jobs. I was about to miss jobs a lot.
In Kubernetes, jobs let you run one-off tasks. We use this functionality to stand up the database, run migrations, and seed initial data for the dev environment since we regenerate that every day. It works alright: deployment pods bounce off until the database comes up initialized. If something unexpected happens, kill the pod and Kubernetes starts another for you. So far, so good.
Docker Compose doesn't do that. In Docker Compose, things that start up are supposed to stay up, or be replaced if they don't stay up. This was a problem. I was looking for a way to issue a single docker-compose up and have a brand new cluster with all the complicated once-off init stuff done for me. It'd be easy to expand the application server image entrypoint to do all the initialization, but each cluster runs two or three of those behind a load balancer, so just doing that could have inconsistent results from the spinup code firing multiple times.
Broken down, here's everything that needs to happen between the application services and the database when the cluster comes up, in order:
1. If Postgres isn't running yet, don't start any app services.
2. If the application database roles do not exist, create them.
3. If the application database does not exist, create it.
4. Deploy any outstanding migration scripts to the database.
5. If the database infrastructure for single sign-on does not exist, create it.
6. Deploy any new content to the database.
7. Update the locale files for new content in each application container.
8. Apply configuration to each application container.
9. Start the application services.
Everything up through #6 needs to happen once and only once. But with Docker Compose, we don't have one-offs. So it all has to go in the entrypoint script, or near enough; we just have to make sure only one of the application services can execute the sensitive parts, and that its peers wait for that to happen before they spin up.
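To make the split concrete, here is a rough sketch of the shape such an entrypoint could take. It only prints the steps rather than performing them, and INIT_MARKER, is_init_service and startup_plan are hypothetical names invented for illustration; how the chosen service is actually told apart from its peers is a separate question.

```shell
# Sketch only: INIT_MARKER and the step names are hypothetical placeholders.
INIT_MARKER="${INIT_MARKER:-/tmp/hypothetical-init-marker}"

# True only in the one container chosen to do the once-only work.
is_init_service() {
    [ -e "$INIT_MARKER" ]
}

# Prints the steps this container would run, in order.
startup_plan() {
    if is_init_service; then
        # Once-only steps: database roles, database, migrations, SSO, content.
        echo "create roles"
        echo "create database"
        echo "run migrations"
        echo "create single sign-on infrastructure"
        echo "deploy content"
    fi
    # Per-container steps: every application service runs these.
    echo "update locale files"
    echo "apply configuration"
    echo "start application service"
}
```

Waiting for Postgres is deliberately left out of the plan, since that has to happen before the entrypoint runs at all.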
Scheduling Startup
The first problem to solve is making sure nothing tries to come up until the database is there. With Kubernetes, we use init containers for this: both the setup job and the application server deployment declare an init container which tries to select 1 every few seconds until it succeeds. Docker Compose doesn't have anything like that to my knowledge; the most it does is generate a dependency graph from your links and depends_on and a couple other service attributes. This ensures that services are started in a particular order, but since Postgres takes a couple seconds to come up, the dependent containers could in fact finish their startup before it's ready.
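The init container behaviour described above boils down to a retry loop around a probe command. A minimal sketch of that idea follows; retry_until is a hypothetical helper written for this post, not our actual init container and not part of any tool mentioned here.

```shell
# Sketch only: retry_until is a hypothetical helper.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    # Keep running the probe command until it exits successfully.
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

Something like retry_until 30 2 psql -h postgres -U postgres -c 'select 1' would then probe the database every two seconds for up to a minute, assuming the database service is reachable as postgres and accepts that user.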
The way to ensure nothing tries to talk to Postgres until it's good and ready is to wrap the startup command. The Docker Compose documentation recommends a few options; I went with wait-for-it. It looks like this in the Compose config:
entrypoint: ["./wait-for-it.sh", "postgres:5432", "--", "bash", "./entrypoint.sh"]
Our entrypoint.sh is not run unless and until the Postgres container starts listening on its default port 5432. That's great, but there's one other thing that makes this really useful: since we already have multiple application services defined (Swarm isn't guaranteed so we can't set replicated mode), we can pick one of those to wait for Postgres to come up, and have the rest wait for it to come up in turn. That's our init container.
Secrets
At this point we can ensure that nothing that depends on the database will start up before the database is ready, and that one of our app services will always finish its startup before any others begin theirs. What we need now is a way to distinguish that service from the others so it can execute our once-only tasks. That's where secrets come in.
Secrets are basically the same concept between Kubernetes and Docker Compose: files containing sensitive data which get loaded onto nodes and mounted to the container filesystem. It's more secure than using environment variables. Secrets are defined as a top-level block in the compose config:
secrets:
  db_owner_password:
    file: ./secrets/db_owner_password.txt
And then attached to each service definition:
appserver:
  image: myimage
  links:
    - postgres
  secrets:
    - db_owner_password
  entrypoint: ["./wait-for-it.sh", "postgres:5432", "--", "bash", "./entrypoint.sh"]
The dependent application services don't need the db_owner_password; that's only required to initialize the database. So we can test for the presence of the secret in our entrypoint script, and kick all that off only if it's present:
if [ -e /run/secrets/db_owner_password ]; then
    # check and create the application roles and database, then run the migrations
    :  # no-op placeholder so the block parses until the real commands go here
fi
Now the appserver service is unique, and we've restricted the ability to stage the database to it. We can't be completely careless -- if appserver blindly emits a createdb every time it starts, it'll fail with a "database already exists" error every time after the first -- but since we've guaranteed there will only ever be one container trying to create the database at a time, we can simply check up front.
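Checking up front could look something like the sketch below. It is not our actual entrypoint: database_exists and ensure_database are hypothetical helper names, myapp is a made-up database name, and the connection settings are assumed to reach psql and createdb through the standard PGHOST, PGUSER and PGPASSWORD environment variables.

```shell
# Sketch only: helper names and the "myapp" database are hypothetical.
# psql and createdb pick up PGHOST, PGUSER and PGPASSWORD from the environment.

# Succeeds if a database with the given name already exists.
# The name is interpolated into SQL, so only pass trusted values.
database_exists() {
    # -t: tuples only, -A: unaligned output, -c: run a single command
    [ "$(psql -d postgres -tAc "SELECT 1 FROM pg_database WHERE datname = '$1'")" = "1" ]
}

# Creates the database only when it is missing, so restarts are harmless.
ensure_database() {
    if database_exists "$1"; then
        echo "database $1 already exists, skipping"
    else
        createdb "$1"
    fi
}
```

Calling ensure_database myapp inside the secret-guarded block then behaves the same on the first start and on every restart. The same pattern would apply to roles, querying pg_roles instead of pg_database.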
That leaves the shared configuration and content, which together are more than secrets are meant to deal with.
Volumes
Mounting information from the host system into containers is a pretty general use case. Secrets cover a specific subset of this. For everything else, there are volumes (and again, Kubernetes' version of the concept is a pretty close analogue). Since volumes can be much larger than secrets, they aren't automatically shared across nodes; you have to create a named volume explicitly, and use a driver which is multi-host aware.
Declare named volumes for config and content:
volumes:
  app_conf:
    driver: local # this is obviously not multi-host aware, but it's good enough for testing
  app_content:
    driver: local
Then in the appserver service definition, add a volumes block:
  volumes:
    - app_conf:/home/appserver/app/conf
    - app_content:/home/appserver/app/content
Docker Compose will create the volumes if they do not exist, or you can docker volume create them ahead of time. Better to do the latter: otherwise, the first time you bring up the cluster everything will die horribly because the volumes are empty. If you create them manually, you can docker volume inspect them, find the mountpoint on the host system, and copy the instance configuration and content in before you start spinning things up.
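One caveat: the names app_conf and app_content are not actually the names Docker Compose looks for. Compose prepends the project name and an underscore to the names you supply, and the project name defaults to the name of the directory holding the Compose file. In our setup the prefix is docker_, so the volumes should be named docker_app_conf and docker_app_content.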
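Putting the manual steps together, a seeding script might look roughly like this. Treat it as a sketch under assumptions: compose_volume_name and seed_volume are hypothetical helpers, PROJECT stands in for whatever prefix Compose uses in your setup, and copying into the mountpoint of a local volume generally needs root on the Docker host.

```shell
# Sketch only: helper names are hypothetical; PROJECT is the Compose prefix.
PROJECT="${PROJECT:-docker}"

# Prints the name Compose will look for, e.g. docker_app_conf.
compose_volume_name() {
    echo "${PROJECT}_$1"
}

# Creates the prefixed volume and copies a host directory's contents into it.
# Usage: seed_volume <volume_name> <source_directory>
seed_volume() {
    name="$(compose_volume_name "$1")"
    docker volume create "$name" >/dev/null
    # Ask Docker where the volume lives on the host filesystem.
    mountpoint="$(docker volume inspect --format '{{ .Mountpoint }}' "$name")"
    # "src/." copies the directory contents, hidden files included.
    cp -R "$2"/. "$mountpoint"/
}
```

With that in place, seed_volume app_conf ./conf and seed_volume app_content ./content (both source paths being examples) would fill the volumes before the first docker-compose up.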
The End
Start to finish, it took me a few weeks to get my first real Compose cluster set up. It's rough getting started, even though the Docker and Compose documentation is quite good; it's a lot to wrap your head around, and there are a lot of concepts you really just have to sort of brute force your way into understanding. I had a lot of other stuff on my plate at the time (still do!), which certainly didn't help matters either.
The good news is, yesterday I had to set up another app with a similar Compose configuration from scratch. This time, I had it up and running within a couple hours. Once you've got the structure down and understand how the pieces fit together it's a lot more manageable.