Why You Should Use Multi-Stage Docker Builds in Production
This is a cross-posting from https://itnext.io/using-multi-stage-docker-builds-for-speed-and-security-9d3a1cd9cd8c
It’s not too often that speed and security combine forces, but when they do, it’s a surefire way to know that a pattern is worth adopting. It’s like a car part that looks good and makes your car faster — how can you go wrong? In this post (with copious amounts of examples), I’m going to show you what multi-stage Docker builds are and how they are faster and more secure!
For the source code in this article, please refer to the GitHub repository: https://github.com/docker-slim/examples/tree/master/3rdparty/node12_express_official
How Dockerfiles Work
Docker containers are usually built with a Dockerfile: a set of instructions that help you package your source code, install dependencies, and build your application (if it compiles a binary). However, a lot of the time the things you need to build your application aren’t the things you need to run it. Let’s consider a standard Node Dockerfile from the Node.js website.
// Node Sample Dockerfile - Single Stage
FROM node:12
ADD . /app
WORKDIR /app
RUN npm install
EXPOSE 8080
CMD [ "node", "server.js" ]
You’ll notice that before we start working with directories and copying files into the image, we start with FROM node:12. You see, Dockerfiles are like a giant onion and the first FROM is the core of your onion. It gives you the binaries and Linux file structure that you need to keep adding more layers, which will eventually be your final application. However, what’s inside the core? Let’s run a bash shell inside of the node:12 image to find out!
A Look Inside Our First Layer
$ docker run -it node:12 /bin/bash
# uname -a
Linux c415d0a3fb27 4.19.76-linuxkit #1 SMP Tue May 26 11:42:35 UTC
// Ok so we're running Linux
# ls
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
// Standard file system in Linux
# ls /bin
ps su rm kill ping sh sed stty chmod chown chgrp bash date pwd ls which mv
# ls /usr/local/bin
docker-entrypoint.sh node nodejs npm npx yarn yarnpkg
As you can see, node:12 is a great place to start building our application. It’s got everything we need to create a file system, change ownership of directories, and of course contains npm so we can install our node_modules dependencies. But why do we need rm, kill, mv, or ping? Even more so, our application calls node to run server.js, so why do we still need our package manager npm, which can install anything under the sun?
The answer is simple: having all of these tools and the feel of a full Linux operating system is great for getting started, but keeping all of those binaries around is insecure, and lugging such a large image around is slow. From here we can do one of two things: (1) rip everything out by deleting everything we don’t need, or (2) copy only the things we need and move those into a second, fresh stage. Enter… multi-stage Dockerfiles!
Multi-Stage Dockerfiles
In the last example we saw how convenient it was to have a huge set of tools when we were building and installing our application and its dependencies, but we knew we didn’t want all of that bloat in the final container that gets delivered to its destination (most likely, Kubernetes). Below is a multi-stage Dockerfile:
// Node Sample Dockerfile - Multi-stage
FROM node:12 AS stage1
ADD . /app
WORKDIR /app
RUN npm install
# Second Stage
FROM gcr.io/distroless/nodejs
COPY --from=stage1 /app /app
WORKDIR /app
EXPOSE 8080
CMD ["server.js"]
In the first stage (stage1) we are using the node:12 image to start. This gives us a great base to build our application. However, after we have copied our code to /app and run npm install, we move on to a second stage (the second FROM) and pull the Node Distroless image (gcr.io/distroless/nodejs). Distroless is an amazing, minimal Docker image; here’s a great description from their GitHub:
"Distroless" images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.
After we pull the Distroless image, we COPY from our first stage the contents of /app, which should be our node_modules as well as our source code server.js. The important part to note here is that the new image doesn’t contain bash or any other tools with which you might want to exec. Whilst inconvenient for debugging, the attack surface has been dramatically reduced, which is perfect for a production deployment of your container. From the outside, the application is doing the same thing as before (in this example, serving up a webserver on port 8080).
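If you want to convince yourself the shell is really gone, try overriding the entrypoint to launch one; this sketch assumes the multi-stage image above was tagged node-multi:
// The Distroless nodejs image uses node as its entrypoint, so override it
// to attempt launching a shell; this fails because the image has no /bin/sh
$ docker run --entrypoint=/bin/sh -it node-multi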
Debugging the Container
To avoid the inconvenience of not being able to debug your container, Distroless provides an alternative :debug tag of their images which includes shell access (via a BusyBox shell). If your application is running into trouble, you should keep a debug version of your Dockerfile that you can deploy in case you need to kubectl exec or docker exec into your container. Here’s an example of that below for reference (also in the GitHub repository):
// Node Sample Dockerfile - Multi-stage with Shell for Debugging
FROM node:12 AS stage1
ADD . /app
WORKDIR /app
RUN npm install
# Second Stage
FROM gcr.io/distroless/nodejs:debug
COPY --from=stage1 /app /app
WORKDIR /app
EXPOSE 8080
CMD ["server.js"]
What About Size?
Alright, up until this point we’ve just gone over the technical difference between the two Dockerfiles from a security lens. But as promised in the title, what about the size? Since multi-stage cuts out all of the unnecessary clutter by copying only the things we need, let’s inspect the size of the respective images to verify that claim:
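Here’s a sketch of that check, assuming the single-stage and multi-stage builds were tagged node-single and node-multi respectively:
// List each image and compare the SIZE column (tag names are assumptions)
$ docker images node-single
$ docker images node-multi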

Wow! Our new image is 74.3MB compared to 921MB. Having a smaller image size speeds up all of our build and deployment steps, and if you’re in Kubernetes, smaller images pull and start faster, making them one of the quickest ways to speed up performance all around.
Summary
As you’ve seen, a multi-stage Dockerfile separates your build into two parts: (1) the setup of your application code (such as dependencies) and (2) the setup of your application runtime. Dockerfiles are extremely easy to use out of the box, but when we are optimizing for security and speed, multi-stage builds are what you need to deploy production-ready applications. We also used the Distroless images to make sure our final image contains only what we need to run our application, and nothing more! Please reach out to me at bryant.hagadorn@gmail.com if you wish to discuss further, and follow me on Medium if you like reading about these types of things! Thank you.
References
https://itnext.io/using-multi-stage-docker-builds-for-speed-and-security-9d3a1cd9cd8c
https://github.com/docker-slim/examples/tree/master/3rdparty/node12_express_official