by Douglas Campbell, Principal Engineer
Like many dev groups, we love our ops team and consider them top notch, capable, and imbued with a ton of devops-fu. Nonetheless, we’ve been on the prowl for better ways to deploy code to prod reliably and without time-consuming one-on-one sessions sitting in front of each other’s terminals. In short: too much people-to-people talk, not enough process-to-process talk.
Enter docker as a candidate for addressing this.
Find a way forward
But exactly which part of our system? We needed to find a crack into which we could apply the leverage docker is supposed to provide.
The basic idea is to identify a couple of machine profiles which can support the applications we repeatedly deploy. In theory, this should pave the way to easier creation and destruction of environments. Sounds simple, right?
What we’ve found so far is that the simple case is when you’re standing up a brand new system and it doesn’t need to interact with any existing systems running on the box.
The trickier case has been the opposite, i.e. running docker on a box with other apps/processes which haven’t been dockerized yet, and integrating with our build process, i.e. git, jenkins, a private docker repo, and our existing deployment process.
Simple Case – seyren
Ok – so the “simple” case. Our team needed a self-serve, dev-configurable alerting system which could hook into the graphite metrics generated by our apps. At the time of this writing, we’re going with Seyren.
Seyren is a configurable alerting dashboard for Graphite. We already had Graphite up so there was no need to dockerize it at this time. Seyren depends on mongo. How to express this with docker? And what are the actual dependencies?
- startup order i.e. mongo goes before seyren
- host and port of mongo are needed by seyren once it starts up
By using the --name <name> argument to docker run, the named container becomes available to be linked to, i.e. depended on or aliased by other containers running on the same host.
So we start up our mongo container and give it the name mongo, i.e.
docker run --name mongo … mongo
Next we link to it with our seyren container by using
docker run --link mongo:mongo --name seyren … seyren
Running a container with the link arg ensures that a container with the referenced name is running. Additionally, linking establishes environment variables within the container doing the link with all the IP and PORT information of the linked container.
One can see this by opening a bash shell into the linking container and executing “env” as follows:
LC2QN2066F69W:docker douglas$ docker exec -t -i seyren bash
[root@1b549dee8347 /]# env | grep MONGO
Docker uppercases the link alias and concatenates it with each exposed port to create this set of env variables. So now our seyren startup scripts can see these values and hook up with mongo as desired.
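As a minimal sketch of what such a startup script could do, the function below builds a MongoDB URL from the link variables. It assumes the mongo image exposes MongoDB's default port 27017, so that Docker sets MONGO_PORT_27017_TCP_ADDR and MONGO_PORT_27017_TCP_PORT; the function name and the database name seyren are illustrative assumptions, not our actual script.

```shell
# Build a MongoDB URL from the variables injected by --link mongo:mongo.
# Assumption: the linked container exposes 27017/tcp (MongoDB's default port).
mongo_url_from_link() {
  if [ -z "${MONGO_PORT_27017_TCP_ADDR:-}" ]; then
    echo "MONGO_PORT_27017_TCP_ADDR is not set; was --link mongo:mongo used?" >&2
    return 1
  fi
  # Fall back to the default port if only the address variable is present.
  printf 'mongodb://%s:%s/seyren\n' \
    "$MONGO_PORT_27017_TCP_ADDR" "${MONGO_PORT_27017_TCP_PORT:-27017}"
}
```

A startup script could then export the result under whatever setting the application reads for its mongo location before launching it.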
We were able to manage all the above dependencies in a single fig.yaml file without having to use the docker client shell. Rather than invoking docker a bunch of times, you just call fig once, and all your containers are started in the right order with the env variables set in the right containers.
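For illustration only, a fig file expressing the mongo-before-seyren dependency might look like the one written out by the shell function below. The image names mirror the docker run commands earlier in this post; the function name and the target path are assumptions, and the real file carries more settings than this.

```shell
# Write a minimal, hypothetical fig file for the mongo + seyren pair.
# The "links" entry both orders startup (mongo first) and injects the
# MONGO_* env variables into the seyren container.
write_fig_yml() {
  cat > "$1" <<'EOF'
mongo:
  image: mongo
seyren:
  image: seyren
  links:
    - mongo:mongo
EOF
}
```

With the file written as fig.yml in the current directory, something like `fig up -d` should start both containers in the background.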
We’ve got Seyren up and ready for configuring of alerts in production so we’re quite pleased. We’ve still got bomb proofing to do. That said, bomb proofing is often driven by recoverability. Our Mongo image uses a local volume which can be backed up to preserve state or transfer to another box.
Let me first say that the following case has been labeled tricky because of actual events, not because we anticipated that it would be tricky. The use case here is loading our decorated share widget logs into BigQuery as part of our active data warehouse initiative.
We publish raw share, click, and page view logs from our widget api servers to kafka topics, perform in-flight annotations to enhance the data, and then publish to s3 as newline-separated, gzip-compressed files batched every 5 minutes.
We just needed to double publish to BigQuery as well. Publishing directly from these boxes was chosen for its relatively low latency, i.e. < 10 minutes from share to availability in BigQuery, and for cost reasons. Transferring data out of amazon via bq streaming would incur 5x the cost of posting gz compressed json to BigQuery directly. It also avoids intermediate storage in Google Cloud Storage.
Initially, docker wasn’t targeted for this. It didn’t seem like a good candidate as it’s “just some scripting” and data export. What could be simpler? As always, the devil is in the details.
I developed a few little wrapper scripts for service account activation and the actual file posting with the bq tool and had a solution working on my dev box. Rather than rolling our own BigQuery uploader using java, python, or the rest api, we opted early on to use bq from the google cloud sdk.
We tried to install it on a machine with a profile identical to our widget log decorators and ran into a quagmire of python library dependency issues. These all stemmed from the default python install being 2.6 and from difficulties installing pyOpenSSL in a way that could be used with the Python 2.7 executable.
So now our “simple scripting” program had triggered a cascade of upgrade requirements to the cluster doing our log decoration. In effect, the common host to our log decoration and log upload processes was forcing non-existent dependencies on us. Ugh.
I saw this as a crack into which docker might fit. I went back to my desk, dockerized the upload scripts and supporting python libs with a centos7/python2.7 image, and we had the upload process running 10 minutes after publishing to our private registry.
Our DevOps was (initially) convinced.
The next step was to put the docker run command under runit:
docker run -v /mnt/upload:/mnt/upload -t -i loguploader sh /bq_upload_wrapper.sh
That was easy so we left it running over the weekend. Saturday around 11, we started getting disk space alerts from our test host. Our files slated for upload were piling up due to some non-recoverable error. So what had happened?
Here’s the pseudo code for the upload script
fetch bq p12 credentials from secure location || exit 1
initialize service account with gcloud || exit 1
for (file in *.gz) // moved atomically there by another process
    upload file && rm $file
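The loop at the end of that pseudo code can be sketched in plain sh roughly as follows. This is not our actual wrapper: the function name is made up, and the upload command is passed in as an argument so the sketch does not guess at the exact bq invocation.

```shell
# upload_dir DIR CMD [ARGS...]
# Runs CMD ARGS... FILE for every *.gz file in DIR and deletes a file only
# after its upload succeeded. Returns 1 if any upload failed.
upload_dir() {
  dir=$1
  shift
  failed=0
  for file in "$dir"/*.gz; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$file" ] || continue
    if "$@" "$file"; then
      rm -- "$file"
    else
      echo "upload failed, keeping $file" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Calling it as `upload_dir /mnt/upload ./post_to_bq.sh`, where post_to_bq.sh stands for a hypothetical wrapper around bq, keeps failed files on disk for the next pass, which is also why files pile up when every upload fails.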
We started seeing:
Unexpected exception in load operation: [('asn1 encoding routines',
'ASN1_CHECK_TLEN', 'wrong tag'), ('asn1 encoding routines', 'ASN1_ITEM_EX_D2I', 'nested asn1 error')]
As best I could tell, this pointed to corruption of the service account’s p12 credential file. The error occurred while refreshing the access token. We shut down the test for the weekend.
Come Monday, I decided to remove the while (true) loop from the upload script so that it exits after a single pass. This had the effect of exiting the container and relaunching a brand new one.
We updated the docker run to include the --rm flag in order to automatically remove the created container. Everything worked perfectly!
For about an hour that is. We chased down and worked around a cascade of issues.
sh: not found
Somehow, once the container was launched, the sh command within it disappeared. This caused runit to respawn our container endlessly, which is what it is supposed to do if the program exits, but no actual work was being completed.
Corrupted python install/*
Various pieces of python dependent libraries in the container started vanishing.
All of these pointed to some sort of cannibalization of the downloaded image itself. Possible explanations:
- too many uncleaned up containers
- docker 1.0 era bugs
Upgrade docker to 1.4.1
Instead of extended forensics into the cause of each, we convinced ourselves to retry everything with the latest version of docker, 1.4.1. We downloaded the latest from the binaries download page on the docker site and replaced the stock docker which comes with our amazon image id – ami-64867b0c.
docker “run --rm” woes
With 1.4.1 in place, all the bizarre errors went away and were replaced by a more manageable and understandable set of issues. One of those is the result of the way --rm is implemented.
We wanted to name our container with its logical name rather than an auto-generated one, and we also wanted to wipe away the container after each run in order to avoid accumulating unneeded binary deposits.
The idea behind --rm is simple: remove the newly provisioned container after you’re done. So whenever the CMD <https://docs.docker.com/reference/builder/#cmd> exits, just remove the container which was created from the image specified by the run command.
done in the client
Remember that docker is actually a client and a daemon. It turns out that the --rm flag is implemented in the client. This surprised me when I found out on the docker irc channel. Doing it in the client seemed unfortunate simply because a kill -9 can’t be intercepted and thus the client, in some cases, just will not be able to clean up.
Two cases (at least) happen as a result of a container not getting removed.
a dock with no name
The first is when --name has not been specified. This results in a container leak which leaves bytes on disk that can’t be reclaimed even though they will never be reused.
a named dock
The second is when the --name option is specified. That’s my case. You get a very understandable error message stating that you can’t provision another container because one with that name already exists.
A possible suggestion in the named case, i.e. when --rm and --name are both present, would be to interpret the presence of both flags as a clear go-ahead to clean up any existing container with that name.
Bottom line is that the --rm flag didn’t work for us. I don’t recommend using it because it’s much easier to do the cleanup yourself. I asked around a bit on irc and apparently performing the clean up in the daemon has its own set of issues as well. That’s understandable too, as client-daemon disconnects can occur just as abruptly as the client being killed.
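Doing the cleanup yourself can be as small as the sketch below: remove any leftover container with our name, run a fresh one, then remove it again. The image name and command come from the docker run line shown earlier; the function name is an assumption, and -t -i is left out on the assumption that a supervised service has no interactive terminal.

```shell
NAME=loguploader

# Run one upload pass in a freshly named container, cleaning up before and after.
run_uploader() {
  # A leftover container (e.g. after a kill -9) would block reuse of the name.
  docker rm -f "$NAME" >/dev/null 2>&1 || true
  status=0
  docker run --name "$NAME" -v /mnt/upload:/mnt/upload \
    loguploader sh /bq_upload_wrapper.sh || status=$?
  # Best-effort removal; the next run removes it anyway if this one is skipped.
  docker rm "$NAME" >/dev/null 2>&1 || true
  return "$status"
}
```

A runit run script would simply call run_uploader as its last step and let runit start the script again when it exits.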
Jenkins Gradle integration
For runit, calling docker rm prior to docker run is trivial so we moved on to integrating our use of docker with the rest of our build, deploy, and versioning system. We distribute our apps
We’re not done but believe that we have captured the kernel of something good.
Our BigQuery uploader is in the same repo as the log producer and kafka consumer. Thus, we want to tie their versions together but the deployment mechanism in both cases is different. One is a java daemon.
We use annotated tags along with git describe to generate versions that look like this
So in order to integrate with our java daemon pushes via tgz files stored in s3 (packaged by distribution gradle plugin), we name the images we push to our private registry with the same tag.
gradle task – build, tag, push
There are some plugins available to assist with use of docker but it was so simple just to use exec that we went ahead and put a build, tag, and push set of tasks in a common.gradle file. All it does is look for a src/docker folder, iterate over each folder contained therein, and invoke docker to build, tag, or push the image corresponding to the git describe tag.
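The commands those gradle tasks end up executing are roughly equivalent to the shell sketch below. It is not our common.gradle; the function names and the registry host in the usage note are hypothetical, and building directly with the full registry/name:version reference folds the tag step into the build.

```shell
# image_ref REGISTRY NAME VERSION -> REGISTRY/NAME:VERSION
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Build and push one image per folder under src/docker, tagged with git describe.
build_tag_push() {
  registry=$1
  version=$(git describe) || return 1
  for dir in src/docker/*/; do
    # Skip the literal pattern when src/docker has no subfolders.
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    ref=$(image_ref "$registry" "$name" "$version")
    docker build -t "$ref" "$dir" && docker push "$ref" || return 1
  done
}
```

Run from the repo root, for example as `build_tag_push registry.example.com:5000`, it produces one image per folder, each carrying the same version string as the tgz pushed for the java daemon.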
Conclusions and next steps
Thursday came and went as we worked through the issues listed above. Seyren is running just fine in its long running container. The BigQuery uploader is also ticking along feeding data into our active warehouse solution with nice regularity.
Fig seems like a nice way to manage multiple containers on a single host and runit is working well for short lived programs.
All that said, there may be a couple more stones left to turn over.
We came in on Thursday after leaving the docker container running under runit with --name and --rm removed in favor of cleaning up ourselves.
Disk usage on the root partition had increased from its pre-docker level but was stable, i.e. not increasing. Poking around in /var/lib/docker/mnt, a folder full of folders named after current and previously provisioned container ids, showed that only one of them had any actual files in it. That is consistent with having at most one active container running at a time.
We’re happy about that but we’re looking into why more mounts are being held onto than we expect.
For instance, our BigQuery uploader runs at most every couple of seconds, and yet df -h shows 5 mounted partitions. We’ll dig into this.
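One way to keep an eye on this between runs is to count the mounts that live under docker’s data directory. The helper below is only a sketch: the function name is made up, and it assumes a Linux-style mounts table (/proc/mounts by default) where the second column is the mount point.

```shell
# Count mount points under /var/lib/docker in a mounts table.
# Usage: count_docker_mounts [MOUNTS_FILE]   (defaults to /proc/mounts)
count_docker_mounts() {
  mounts_file=${1:-/proc/mounts}
  # Column 2 of each line is the mount point; n + 0 prints 0 when none match.
  awk '$2 ~ "^/var/lib/docker/" { n++ } END { print n + 0 }' "$mounts_file"
}
```

Logging this number alongside the container count over a day should show whether mounts are being released after each run or are slowly accumulating.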
Different storage layer evaluation
Developers’ requirements for their environments are naturally less stringent than those of production systems. There are a number of different options for the storage layer other than devicemapper which we plan to explore and research.
stage integration environment
We’ll be looking at docker to handle both of these. We’ll put one or more jenkins jobs inline with our build process to act as gating steps to deployment.
It’s Sunday now. Docker’s chugging away. Our sprint starts tomorrow and it’s full of docker work.