Adesoji Alu brings a proven ability to apply machine learning (ML) and data science techniques to solve real-world problems. He has experience with a variety of cloud platforms, including AWS, Azure, and Google Cloud Platform, and strong skills in software engineering, data science, and machine learning. He is passionate about using technology to make a positive impact on the world.

How I Reduced a Docker Image Size by 90%: A Step-by-Step Journey


Let me take you through my journey of optimizing a Python-based Machine Learning application’s Docker image, reducing it from a hefty 3.09GB to just 280MB. Here’s how I did it, step by step.

The Initial Problem

I started with a typical Dockerfile for a machine learning application that uses TensorFlow:


FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

With these requirements:


tensorflow
pandas
numpy
scikit-learn
pillow
flask
gunicorn

Initial image size: 3.09GB 😱


Step 1: Analyzing the Base Image

First, I used docker history to understand what was taking up space:


docker history image_name

Key findings:

  • Base python:3.9 image: 934MB
  • TensorFlow and its dependencies: 1.7GB
  • Our application code: ~200MB
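If you audit images often, this bookkeeping is easy to script. A minimal sketch, with a hypothetical `parse_size` helper and hardcoded sample sizes (the real values come from `docker history` output):

```python
import re

def parse_size(text):
    """Convert a docker-history size string like '934MB' or '1.7GB' to bytes."""
    match = re.fullmatch(r"([\d.]+)\s*(B|KB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    factor = {"B": 1, "KB": 1e3, "MB": 1e6, "GB": 1e9}[unit]
    return float(value) * factor

# Layer sizes as reported above (base image, TensorFlow, app code)
layers = ["934MB", "1.7GB", "200MB"]
total_gb = sum(parse_size(s) for s in layers) / 1e9
print(f"{total_gb:.2f}GB")  # → 2.83GB
```

The remaining gap up to 3.09GB comes from the smaller intermediate layers not listed here.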

Step 2: Switching to a Slim Base Image

Changed from python:3.9 to python:3.9-slim:


FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

New size: 1.9GB (Reduction: ~39%)

Step 3: Optimizing Dependencies

I noticed we didn’t need the full TensorFlow package, so I switched to the TensorFlow Lite runtime. Note that tflite-runtime only runs inference on models already converted to the .tflite format, so this works only if your application doesn’t need the full TensorFlow API:


# requirements.txt
tflite-runtime
pandas
numpy
scikit-learn
pillow
flask
gunicorn

New size: 1.2GB (Reduction: ~61%)

Step 4: Multi-stage Build

Implemented a multi-stage build to separate build dependencies from runtime:


# Build stage
FROM python:3.9-slim AS builder

WORKDIR /app
COPY requirements.txt .

RUN pip install --user -r requirements.txt

# Runtime stage
FROM python:3.9-slim

WORKDIR /app

# Copy only the necessary files from builder
COPY --from=builder /root/.local/lib/python3.9/site-packages /root/.local/lib/python3.9/site-packages
COPY app.py .
COPY models /app/models

ENV PATH=/root/.local/bin:$PATH

CMD ["python", "app.py"]

New size: 850MB (Reduction: ~72%)
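Copying `/root/.local` works because `pip install --user` writes into Python’s per-user site directory, which the interpreter adds to `sys.path` automatically at startup; the explicit `PATH` entry is only needed for console scripts in `/root/.local/bin`. You can confirm where that directory lives:

```python
import site

# The per-user site-packages directory that `pip install --user` targets;
# inside the python:3.9-slim image this resolves to
# /root/.local/lib/python3.9/site-packages
print(site.getusersitepackages())
```

The printed path varies by platform, but it always ends in `site-packages`.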

Step 5: Cleaning Up Package Manager

Added cleanup commands and combined RUN statements:


# Stage 1: Builder
FROM python:3.9-slim AS builder

WORKDIR /app
COPY requirements.txt .

# Install dependencies and remove unnecessary files to reduce image size
RUN pip install --user --no-cache-dir -r requirements.txt && \
    find /root/.local \
        \( -type d \( -name test -o -name tests \) \) -prune -exec rm -rf '{}' + -o \
        \( -type f \( -name '*.pyc' -o -name '*.pyo' \) \) -exec rm -f '{}' +

# Stage 2: Final Image
FROM python:3.9-slim

WORKDIR /app

# Copy only necessary runtime dependencies
COPY --from=builder /root/.local /root/.local
COPY app.py .
COPY models /app/models

# Install required system dependencies and clean up
RUN apt-get update && \
    apt-get install --no-install-recommends -y libgomp1 && \
    rm -rf /var/lib/apt/lists/*

# Ensure local Python packages are available
ENV PATH="/root/.local/bin:$PATH"

CMD ["python", "app.py"]

New size: 280MB (Total reduction: 90%)

Final Results

Let’s look at the progression:

  • Initial image: 3.09GB
  • Slim base image: 1.9GB
  • Optimized dependencies: 1.2GB
  • Multi-stage build: 850MB
  • Final optimized image: 280MB
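The headline number checks out from the sizes alone (3.09GB ≈ 3090MB):

```python
def reduction_pct(initial_mb, final_mb):
    """Percent size reduction going from initial_mb to final_mb."""
    return 100 * (1 - final_mb / initial_mb)

print(round(reduction_pct(3090, 280)))  # → 91, i.e. roughly 90%
```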

Key Takeaways

  • Analyze First: Always use docker history to understand what’s consuming space.
  • Choose the Right Base: Slim variants can significantly reduce size without sacrificing functionality.
  • Optimize Dependencies:
    • Use lighter alternatives when possible
    • Only install what you need
    • Consider using wheels for Python packages
  • Multi-stage Builds: Separate build-time dependencies from runtime needs.
  • Clean Up: Remove unnecessary files and cache after installations.
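The wheels suggestion pairs naturally with a multi-stage build: compile every dependency into wheel files once in the builder stage, then install them offline in the final stage so no compilers or build headers ever enter the runtime image. A sketch, not the exact Dockerfile from this post:

```dockerfile
# Build stage: compile all dependencies into wheel files
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir -r requirements.txt -w /wheels

# Runtime stage: install from the prebuilt wheels, no toolchain needed
FROM python:3.9-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
COPY requirements.txt app.py ./
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r requirements.txt && \
    rm -rf /wheels
CMD ["python", "app.py"]
```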

Bonus Tips

  • Layer Caching: Keep frequently changing files in later layers.
  • .dockerignore: Exclude unnecessary files from the build context.
  • Use BuildKit: Enable Docker BuildKit for more efficient builds:

export DOCKER_BUILDKIT=1
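As an illustration of the `.dockerignore` tip, a file like the following (entries are examples, adjust to your project) keeps VCS metadata, virtual environments, and local datasets out of the build context:

```
.git
__pycache__/
*.pyc
.venv/
data/
notebooks/
Dockerfile
.dockerignore
```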

Measuring Impact

Beyond just size reduction, this optimization brought several benefits:

  • 75% faster deployment times
  • Reduced bandwidth costs
  • Improved security with smaller attack surface
  • Faster container startup times

Remember, the goal isn’t just to make images smaller – it’s to find the right balance between size and functionality for your specific use case.

Have Queries? Join https://launchpass.com/collabnix

