Making the Case for Static Container Images in AWS ECS
Brien Posey details several reasons why a container image should not be dynamically updated.
In today's world, dynamic software updates are commonplace. Software vendors routinely update their products to address security vulnerabilities, fix bugs or add new features. When it comes to AWS ECS container images, however, dynamic updates are usually a bad idea.
One of the big reasons Amazon recommends against dynamically updating a container is that doing so can adversely affect your workload's ability to scale to accommodate demand spikes.
Imagine for a moment that you have a workload consisting of a collection of containerized web servers and that a load balancer distributes inbound web requests across the available containers. For the sake of this example, let's also assume that AWS has been configured to launch additional containers from the image as a way of dealing with workload demand spikes.
Even in ideal circumstances, workload scaling does not happen instantaneously. The scaling process is usually tied to CloudWatch alarms, meaning that the workload cannot scale until a performance threshold has been crossed. Even then, it can take a few minutes to create the additional containers and bring them online. With that in mind, imagine that the newly created containers also had to download a set of dependencies and updates before they could begin servicing the workload. That requirement would make the scaling process take even longer. The demand spike might be over by the time the newly created containers have downloaded everything they need and are ready to service the demand.
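To put rough numbers on that delay, here is a minimal sketch in Python. Every figure in it (the alarm delay, the provisioning time, the per-dependency download time and the dependency count) is a hypothetical value chosen for illustration, not a measurement from ECS.

```python
def seconds_until_ready(alarm_delay_s, provision_s, download_s_per_dependency, dependency_count):
    """Rough time from the start of a demand spike until a new container can serve traffic."""
    startup_downloads_s = download_s_per_dependency * dependency_count
    return alarm_delay_s + provision_s + startup_downloads_s


# Hypothetical figures: 60 s for the alarm to fire, 120 s to start the container.
static_image = seconds_until_ready(60, 120, 0, 0)     # everything is baked into the image
dynamic_image = seconds_until_ready(60, 120, 20, 5)   # five downloads at about 20 s each

print(static_image, dynamic_image)  # 180 280
```

Under these assumed numbers, the container that downloads its dependencies at startup takes 100 seconds longer to come online, and that gap grows with every dependency added.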
Another reason why you should avoid creating container images that dynamically download external content is that doing so can undermine your application's reliability. This can happen in a few different ways.
One of the most basic ways that dynamic downloads can impact workload reliability is that the downloads themselves introduce potential points of failure. Imagine for a moment that your workload consists of ten container instances and each one needs to dynamically download five dependencies at startup. That's a total of 50 downloads across all of the instances. Because any one of those downloads could fail, there are 50 chances for failure that would not exist if the container used only static code.
Even if all of the downloads work perfectly, however, dynamically downloading container dependencies can compromise application reliability in another way. Before I tell you what could potentially happen, I need to take a quick step back and talk about one of the most basic and fundamental aspects of containers.
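The effect of those 50 chances can be estimated with a short calculation. The sketch below assumes that each download fails independently with the same probability, and the 1 percent failure rate is a made-up figure used only for illustration.

```python
def probability_any_download_fails(per_download_failure_rate, containers, downloads_per_container):
    """Chance that at least one startup download fails, assuming independent failures."""
    total_downloads = containers * downloads_per_container
    probability_all_succeed = (1 - per_download_failure_rate) ** total_downloads
    return 1 - probability_all_succeed


# Ten containers with five downloads each, as in the example above.
# The 1% per-download failure rate is a hypothetical value.
p = probability_any_download_fails(0.01, 10, 5)
print(f"{p:.1%}")
```

With those assumptions, there is roughly a 39 percent chance that at least one of the 50 downloads fails during a full startup of the workload, even though each individual download is 99 percent reliable.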
One of the things that make containers so attractive to developers is that containers allow for application portability. A developer can create and test a containerized application on their laptop and then move the application to the cloud without having to worry about changing any of the application's code. So long as the developer has done a good job of testing their application there is no need to worry about whether or not the application will work in production. The application will work in exactly the same way in the cloud that it did on the developer's own laptop.
With that in mind, consider a situation where a developer builds and tests a containerized application, but this time let's pretend that the application is designed to download dependencies from an external source. The developer thoroughly tests the application to make sure that it works properly and then uploads the application to the AWS cloud, where the application continues to work as intended. One day, however, the vendor who created the dependencies makes a change to their code. The application downloads the dependencies as it always has, but the new version of the dependencies is not compatible with the application, causing the application to fail. At this point, an outage occurs and the developer will need to figure out a way to fix the problem.
If the dependencies had been statically included in the container image, the problem never would have happened. If the vendor released a new version of the dependencies, the developer could test the application with the new dependencies. If everything worked the way that it is supposed to, the developer could simply create a new container image. This approach eliminates the possibility of an application running untested code.
All of this is to say that the number one reason a container image should not dynamically download dependencies is that a static image that has been thoroughly tested can be treated as a "known good" image. That means that if a problem occurs within a container, the organization can simply terminate that container instance and spin up a new instance from the known good container image, knowing that the new container instance will run exactly the code that was tested. However, when you start introducing unknown code through dynamic downloads, you can no longer guarantee the integrity of your container images.
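One way to keep a deployment tied to a known good image is to reference the image by its content digest rather than by a tag that can be repointed later. The sketch below is a hypothetical pre-deployment check, not part of any AWS tooling: the function name is invented for this example, and the registry address is a made-up placeholder.

```python
import re

# An image reference pinned by digest ends in "@sha256:" followed by 64 hex characters.
_DIGEST_REFERENCE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_reference: str) -> bool:
    """Return True if the reference names one exact image rather than a movable tag."""
    return _DIGEST_REFERENCE.match(image_reference) is not None


# Hypothetical repository address and digest, for illustration only.
repository = "123456789012.dkr.ecr.us-east-1.amazonaws.com/web"
pinned = repository + "@sha256:" + "a" * 64
tagged = repository + ":latest"

print(is_pinned_by_digest(pinned))  # True
print(is_pinned_by_digest(tagged))  # False
```

A check along these lines could run before a task definition is registered, so that every new container instance starts from the same tested image.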
Brien Posey is a 21-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.