Zero Downtime Docker Deployment with Amazon ECS

Earlier, I wrote about zero downtime Docker deployment with Tutum. I have recently started experimenting with Amazon ECS (EC2 Container Service), Amazon’s offering for orchestrating Docker containers (and thus a competitor to Tutum). I don’t have a lot of experience with ECS yet, but it looks pretty solid so far.

I found a real lack of documentation on how to deploy updated Docker images to ECS, however, and had to do some research to figure out how it should be done. Through some experimentation of my own with the AWS ECS CLI and, not least, some valuable help on the Amazon forums, I came up with a script and a set of requirements for the ECS configuration that together give streamlined deployment. As it turns out, you even get zero downtime deployment for free (provided that you use an elastic load balancer, mind)! It’s more magical than in the Tutum case: by default, ECS upgrades one node after the other in the background (hidden by the load balancer), so long as it is allowed to take down at least one node as required.


The requirements one must follow in setting up the ECS cluster are:

  • An elastic load balancer
  • An idle ECS instance, or a service deployment configuration with minimumHealthyPercent
    of less than 100 (so that ECS always has room to start the new task definition, either
    on the spare node or on a node freed by stopping an old task)
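
To make the second requirement concrete, here is a minimal sketch of the arithmetic behind minimumHealthyPercent. It assumes the rounding rules described in the AWS documentation (the lower limit is rounded up, the upper limit from maximumPercent is rounded down); the function names are my own, not part of any AWS tool.

```python
def deployment_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (min_running, max_running) task counts during a deployment.

    Assumption: ECS rounds the lower limit up and the upper limit down.
    Integer arithmetic is used to avoid floating point surprises.
    """
    # Ceiling division for the lower limit.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    # Floor division for the upper limit.
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


def can_stop_old_task_first(desired_count, minimum_healthy_percent):
    """True if ECS may stop at least one old task before starting a new one."""
    min_running, _ = deployment_bounds(desired_count, minimum_healthy_percent, 100)
    return min_running < desired_count
```

With two tasks and minimumHealthyPercent set to 50, `can_stop_old_task_first(2, 50)` is true, so ECS can free a node without a spare instance. With a single task, 50 percent still rounds up to one running task, so under these rounding rules a spare instance (or a lower percentage) would be needed.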

Additionally, if you are using a private Docker image registry, you must add authentication data to /etc/ecs/ecs.config in your ECS instances and reboot them. In the following example, I assume as the private registry:
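
As a rough illustration of what that authentication data can look like, the sketch below renders the two lines for /etc/ecs/ecs.config using the ECS agent's `dockercfg` auth type. The registry host `registry.example.com`, the credentials, and the function name are placeholders of mine, not the values from my setup, and the exact registry key your agent expects may differ.

```python
import base64
import json


def ecs_config_auth_lines(registry, username, password, email):
    """Render ECS agent auth settings in dockercfg form.

    All arguments are hypothetical placeholders; substitute your own.
    """
    # dockercfg stores the credentials as base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    auth_data = {registry: {"auth": token, "email": email}}
    return [
        "ECS_ENGINE_AUTH_TYPE=dockercfg",
        "ECS_ENGINE_AUTH_DATA=" + json.dumps(auth_data, separators=(",", ":")),
    ]


if __name__ == "__main__":
    for line in ecs_config_auth_lines(
        "registry.example.com", "user", "secret", "me@example.com"
    ):
        print(line)
```

The printed lines would be appended to /etc/ecs/ecs.config on each ECS instance before rebooting it.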


You must also log in to your Docker image registry and configure the AWS CLI on the machine where you run the deploy script.


The procedure implemented by my script (deploy-ecs) looks as follows:

  1. Tag Docker images corresponding to containers in the task definition with the Git revision.
  2. Push the Docker image tags to the corresponding registries.
  3. Deregister old task definitions in the task definition family.
  4. Register new task definition, now referring to Docker images tagged with current Git revisions.
  5. Update service to use new task definition.
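
The five steps above can be sketched roughly as follows. This is not the deploy-ecs script itself, only an illustration in Python that builds the `docker` and `aws ecs` commands in order. It assumes the task definition JSON file has already been rendered to reference the images tagged with the current revision; the function and parameter names are hypothetical.

```python
import subprocess


def current_git_revision():
    """Return the short Git revision of the working tree."""
    out = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"])
    return out.decode().strip()


def plan_deploy(images, revision, family, old_task_definitions,
                task_definition_file, cluster, service):
    """Build the commands for the five steps, in order, without running them."""
    commands = []
    # Step 1: tag each image with the Git revision.
    for image in images:
        commands.append(["docker", "tag", image, f"{image}:{revision}"])
    # Step 2: push the new tags to their registries.
    for image in images:
        commands.append(["docker", "push", f"{image}:{revision}"])
    # Step 3: deregister old task definitions (given as "family:revision").
    for task_definition in old_task_definitions:
        commands.append(["aws", "ecs", "deregister-task-definition",
                         "--task-definition", task_definition])
    # Step 4: register the new task definition from the rendered JSON file.
    commands.append(["aws", "ecs", "register-task-definition",
                     "--family", family,
                     "--cli-input-json", f"file://{task_definition_file}"])
    # Step 5: point the service at the latest active revision of the family.
    commands.append(["aws", "ecs", "update-service",
                     "--cluster", cluster, "--service", service,
                     "--task-definition", family])
    return commands


def run(commands):
    """Execute the planned commands, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

A deployment would then be something like `run(plan_deploy(images, current_git_revision(), ...))`. Keeping the planning separate from the execution makes it possible to print and inspect the commands before anything touches the cluster.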

3 Responses

  1. Jon Nordby
    January 23, 2016 at 1:34 pm

    Hi Arve,
    we used this existing project to solve the same problem
    (on NM)

  2. Vincent baronnet
    June 15, 2017 at 4:42 am

    Using Terraform, we follow this exact workflow, but we do experience a short downtime (around 30 seconds). For now we haven’t found a solution; it seems to be caused by the way ECS works. There is a window, while the old tasks are stopping and the new tasks are launching, in which no containers are running.
