Docker is a popular containerization platform used to create, deploy, and manage applications. One of the features that make Docker so powerful is its ability to create multi-stage builds. Multi-stage builds are a way to optimize the Docker build process by reducing the size of the final image and improving performance. In this blog post, we will discuss what multi-stage builds are and how to use them in Docker. We will also walk through a simple example of a todo list app to demonstrate the benefits of using multi-stage builds.
What are Multi-Stage Builds?
Multi-stage builds are a feature introduced in Docker 17.05. They allow developers to write a single Dockerfile that defines multiple stages for building an image. Each stage starts from its own base image and has its own set of instructions, which means the resulting image can be optimized for size and performance. In a multi-stage build, later stages can copy files and artifacts out of earlier stages, and only the final stage produces the image that will be used to run the application; the intermediate stages are cached for rebuilds but are not shipped.
The key advantage of using multi-stage builds is that it allows developers to reduce the size of the final image. By breaking down the build process into smaller stages, it becomes easier to remove unnecessary files and dependencies that are not needed in the final image. This can significantly reduce the size of the image, which can lead to faster deployment times and lower storage costs.
In addition to reducing image size, multi-stage builds can also improve build performance. By breaking down the build process into smaller stages, Docker can cache the intermediate images and reuse them if the source code or dependencies haven’t changed. This can lead to faster builds and shorter development cycles.
Let’s now take a look at a simple example of a todo list app and see how we can use multi-stage builds to optimize the Docker build process.
Features of Docker Multi-Stage Builds
There are several features of Docker Multi-Stage builds that make it a powerful tool for building Docker images. These features include:
Reduced Image Size: By using multiple stages, Docker Multi-Stage builds can significantly reduce the size of the final Docker image. This is because each stage only includes the necessary files and dependencies for that particular stage, resulting in a smaller and more efficient image.
Improved Performance: Docker Multi-Stage builds can also improve the performance of Docker images. Smaller images are faster to push, pull, and start, and because each stage can be cached independently, unchanged stages are reused on rebuilds.
Simplified Build Process: Multi-Stage builds also simplify the build process by enabling developers to define each stage as a separate, named section of a single Dockerfile. This makes it easier to manage dependencies and keep the build process organized.
Customizable Build Process: Multi-Stage builds also enable developers to create a custom build process that is optimized for their application. Each stage can use a different base image and set of instructions, giving developers more control over the build process.
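The pattern behind these features can be sketched in a few lines. The stage name, paths, and entry point below are illustrative assumptions, not taken from the todo app built later in this post:

```dockerfile
# Build stage: has the full toolchain and dev dependencies
FROM node:14-alpine AS build
WORKDIR /app
COPY . .
RUN npm install && npm run build

# Runtime stage: starts from a fresh image and copies only the build output
# (assumes the build script writes to dist/ with an entry point dist/server.js)
FROM node:14-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```

Everything installed in the build stage that is not explicitly copied across is left behind, which is where the size savings come from.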
Creating a Multi-Stage Docker Image for a ToDo List Application
To demonstrate how to use Docker Multi-Stage builds, we will create a Docker image for a simple ToDo List application. The application is a Node.js application that uses Express.js and MongoDB to create a simple ToDo List.
Step 1: Define the First Stage
The first step in creating a Multi-Stage Docker image is to define the first stage. The first stage is responsible for installing the application dependencies and building the application. In this example, we will use the Node.js base image and copy the package.json and package-lock.json files to the container. We will then run the npm install command to install the dependencies and npm run build command to build the application.
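This stage assumes that package.json defines build and start scripts. A hypothetical minimal version might look like the following; the real app's dependencies and script contents may differ:

```json
{
  "name": "todo-app",
  "scripts": {
    "build": "node scripts/build.js",
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.17.1",
    "mongodb": "^4.1.0"
  }
}
```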
Add the following code to a new file named Dockerfile in the root directory of the application:
```dockerfile
# First Stage: Install Dependencies
FROM node:14-alpine AS base
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
```
- The WORKDIR command specifies the working directory inside the container.
- The COPY command copies the package.json and package-lock.json files to the container.
- The RUN npm install command installs the dependencies.
- The second COPY command copies the rest of the application files to the container.
- The RUN npm run build command builds the application.
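Because COPY . . sends everything in the build context into the image, it is common (though optional) to add a .dockerignore file next to the Dockerfile so that the local node_modules directory and other clutter are excluded. A typical example:

```
node_modules
npm-debug.log
.git
Dockerfile
```

This keeps the build context small, which speeds up the build and avoids overwriting the freshly installed dependencies with stale local ones.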
Note that we use the alpine version of the Node.js base image for the first stage, as it is smaller and more lightweight than the standard version.
Step 2: Define the Second Stage
The second step in creating a Multi-Stage Docker image is to define the second stage. The second stage is responsible for running the application. In this example, we will use a slim version of the Node.js base image for the second stage and copy the application files from the first stage.
Add the following code to the existing Dockerfile to define the second stage:
```dockerfile
# Second Stage: Run Application
FROM node:14-slim AS production
WORKDIR /app
COPY --from=base /app .
EXPOSE 3000
CMD ["npm", "start"]
```
- The --from=base flag in the COPY command tells Docker to copy the files from the first stage (named base) into the second stage.
- The EXPOSE command documents that the application listens on port 3000; the port still has to be published with -p when the container is run.
- The CMD command tells Docker to run the npm start command when the container is started.
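One caveat of copying all of /app from the first stage is that node_modules still contains the development dependencies that were only needed for building. If that matters for image size, a common variation (sketched here as an assumption, not part of the original example) prunes them in the first stage before the copy:

```dockerfile
# Variation of the first stage: strip dev dependencies after building
FROM node:14-alpine AS base
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# npm prune --production removes packages listed only under devDependencies
RUN npm prune --production
```

The second stage is unchanged; COPY --from=base /app . now brings over a leaner node_modules.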
Here is what the complete Dockerfile looks like after defining both stages:
```dockerfile
# First Stage: Install Dependencies
FROM node:14-alpine AS base
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Second Stage: Run Application
FROM node:14-slim AS production
WORKDIR /app
COPY --from=base /app .
EXPOSE 3000
CMD ["npm", "start"]
```
This Dockerfile contains two stages: the first stage, named base, installs the dependencies and builds the application, while the second stage, named production, runs the application.
The FROM command is used to specify the base image for each stage. The AS keyword is used to name each stage. We then use the WORKDIR command to set the working directory inside each stage, and the COPY command to copy files from the host machine to the container.
The RUN command is used to run commands inside each stage. In the first stage, we install the dependencies and build the application, while in the second stage we do not need to run any additional commands.
Finally, the EXPOSE command is used to specify which port the application listens on, and the CMD command is used to specify the command that should be run when the container is started.
Step 3: Build the Docker Image
Now that we have defined both stages of the Docker image, we can build the image using the docker build command. Run the following command in the terminal from the root directory of the application:
docker build -t todo-app .
The -t flag specifies the name and tag for the Docker image. In this case, the image will be named todo-app and, since no tag is given, tagged latest by default.
Step 4: Run the Docker Container
After the Docker image is built, we can run the Docker container using the docker run command. Run the following command in the terminal to start the container:
docker run -p 3000:3000 todo-app
The -p flag maps port 3000 on the Docker container to port 3000 on the host machine.
Step 5: Verify the Application is Running
After the Docker container is started, we can verify that the application is running by visiting http://localhost:3000 in a web browser. If the application is running, we should see a simple ToDo List with the ability to add and remove items.
Visualising the Multi-Stage Build using Docker Extension
The Docker team has built an extension that visualises multi-stage builds in a much simpler way.
docker extension install eunomie/hack-docker-ide:v0.1.4
Just add your Dockerfile and visualise the multi-stage build.
Give it a try!
Docker Multi-Stage builds are a powerful tool that can significantly improve the build time and efficiency of Docker images. By using multiple stages, developers can optimize the size and performance of Docker images and create a custom build process that is optimized for their application. While Multi-Stage builds have a learning curve and can be more complex than traditional Docker builds, the benefits are significant and can save developers time and resources in the long run.