Use Cases for Docker Multistage Build
The Multistage Build feature has made it possible to write Dockerfiles very concisely.
I was impressed, so I'm summarizing the exciting points for each use case.
I wrote the notes for this article quite some time ago, so the topic is no longer new: Multistage Build has been available since the Docker version released in May 2017 (17.05), which means it has been around for nearly a year.
Also, there's an article about Multistage Build on the official GCP blog. It's an interesting article with measurements, so please refer to it first. It explains Use Case 1 of this article in detail.
Use Case 1: Reducing Image Size
No discussion of Multistage Build is complete without this benefit. By copying only the build artifacts (and the libraries they need) into the final image, you can leave out both the programs needed only for building and any unnecessary layers.
While the above is a common benefit for many containers, it's particularly powerful for Go language programs. This is because Go programs can run by just copying a single binary, allowing for extremely lightweight images.
Images of just a few MB, which were previously hard to achieve with a single Dockerfile, become possible.
before
FROM alpine:latest
...
# Build the program as `command`
CMD command
Resulting image size: hundreds of MB or more
after
FROM ... as builder
...
# Build the program as `command`
FROM ...
COPY --from=builder command command
CMD command
Resulting image size: a few MB
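As a concrete illustration of the pattern above, here is a minimal sketch for a Go program. The image tags (`golang:1.22`, `alpine:3.20`), the `/src` and `/out` paths, and the binary name `app` are assumptions made for illustration, not part of the original setup. It also assumes the build context contains a Go module with a `main` package at its root.

```dockerfile
# Stage 1: full Go toolchain, used only for compiling.
# (Image tags and paths are assumptions for this sketch.)
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# CGO_ENABLED=0 produces a statically linked binary, so it does not
# depend on the C library of the builder image.
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: small runtime image that receives only the binary.
FROM alpine:3.20
COPY --from=builder /out/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

Only the final stage ends up in the tagged image, so the Go toolchain and the source tree from the `builder` stage do not contribute to its size.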
Use Case 2: Programs Used Only in Development Environment
There are programs that are valuable in development environments but not desired for release, such as hot reload tools and debuggers.
Without Multistage Build, it was necessary to prepare two Dockerfiles (such as Dockerfile.dev and Dockerfile.prod). Because the build preparation was identical in both files, every change had to be made twice, which hurt maintainability.
With Multistage Build, a single Dockerfile can exclude development-only programs from the release image while sharing the same build code. The change needed when building from docker-compose is also minimal.
Personally, I think this is the most convenient use of Multistage Build.
before
Dockerfile.dev
FROM ...
# Preparation for build
# Build
# Install hot reload tool
CMD hot-reload command
Dockerfile.prod
FROM ...
# Preparation for build
# Build
CMD command
after
FROM ... as builder
# Preparation for build
# Build
# Install hot reload tool
# Development: build only this stage (docker build --target builder)
CMD hot-reload command
FROM ...
COPY --from=builder command command
CMD command
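The single-Dockerfile layout above might look like the following sketch. It uses the Delve debugger as the development-only program instead of a hot reload tool, which is a substitution made for illustration. The image tags, paths, binary name `app`, and debugger port 2345 are all assumptions, and `dlv@latest` should be pinned to a release that supports the chosen Go version.

```dockerfile
# Shared build stage; it doubles as the development image.
# (Image tags, paths, and the port are assumptions for this sketch.)
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .
# Development-only tool: the Delve debugger (pin a version in practice).
RUN go install github.com/go-delve/delve/cmd/dlv@latest
# This CMD applies only when the build stops at this stage.
CMD ["dlv", "debug", "--headless", "--listen=:2345", "--api-version=2", "--accept-multiclient"]

# Release stage: receives only the compiled binary, no debugger.
FROM alpine:3.20
COPY --from=builder /out/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

With this layout, `docker build --target builder .` produces the development image, while a plain `docker build .` continues to the last stage and produces the release image. In docker-compose, the `target` key under `build` selects the stage in the same way.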
Use Case 3: Refactoring
As a byproduct of implementing the above two, the dependencies of each step become clearer.
For example, consider a case where you build two programs and then run a program that uses their artifacts (building plugins in one stage, creating local certificates in another). Conventionally, to keep the image small, we would chain all the builds in a single RUN layer of the Dockerfile and run the result at the end.
With Multistage Build, you can separate dependencies into each stage, resulting in a more understandable Dockerfile.
before
FROM ...
RUN apk pre-cmd1 pre-cmd2 cmd3 ... && \
build cmd1 ... && \
cd ... && \
build cmd2 ... && \
...
CMD cmd1 cmd2
after
FROM ... as cmd1
RUN apk pre-cmd1 ... && \
build cmd1 ...
FROM ... as cmd2
RUN apk pre-cmd2 ... && \
build cmd2 ...
FROM ...
COPY --from=cmd1 cmd1 cmd1
COPY --from=cmd2 cmd2 cmd2
CMD cmd1 cmd2
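To make the stage separation above more tangible, here is a sketch with one stage that creates a local self-signed certificate and another that compiles a Go program. The stage names `certs` and `app`, the image tags, all paths, and the `CN=localhost` subject are assumptions chosen for illustration.

```dockerfile
# Stage "certs": creates a self-signed certificate for local use.
# (Names, tags, and paths are assumptions for this sketch.)
FROM alpine:3.20 AS certs
RUN apk add --no-cache openssl && \
    mkdir /certs && \
    openssl req -x509 -newkey rsa:2048 -nodes \
      -keyout /certs/server.key -out /certs/server.crt \
      -days 365 -subj "/CN=localhost"

# Stage "app": compiles the program; it does not depend on "certs".
FROM golang:1.22 AS app
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: collects the artifacts from each stage above.
FROM alpine:3.20
COPY --from=certs /certs /etc/certs
COPY --from=app /out/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

Each stage installs only what its own step needs, so neither openssl nor the Go toolchain ends up in the final image, and the `COPY --from` lines spell out exactly which stage each artifact comes from.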
Summary
Multistage Build is a very useful feature that can simplify complex Dockerfiles. Use it to keep your images lightweight and your Dockerfiles easy to understand.