How I Reduced a Docker Image Size by 90%: A Step-by-Step Journey
Let me take you through my journey of optimizing a Python-based Machine Learning application’s Docker image, reducing it from a hefty 3.09GB to just 280MB. Here’s how I did it, step by step.
The Initial Problem
I started with a typical Dockerfile for a machine learning application that uses TensorFlow:
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
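One thing worth noting about that COPY . . line: it pulls the entire build context into the image, so a .dockerignore file keeps junk out from the very first build. A sketch of what that might contain (entries are typical for a project like this — adjust to your own layout):

```
.git
__pycache__/
*.pyc
.venv/
tests/
*.md
Dockerfile
```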
With these requirements:
tensorflow
pandas
numpy
scikit-learn
pillow
flask
gunicorn
Initial image size: 3.09GB 😱
![original](https://i0.wp.com/collabnix.com/wp-content/uploads/2025/02/Screenshot-2025-02-10-at-8.46.52-PM.png?ssl=1)
Step 1: Analyzing the Base Image
First, I used docker history to understand what was taking up space:
docker history image_name
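To turn that output into numbers you can compare, a few lines of Python will total up the per-layer sizes. A sketch, with illustrative sizes hard-coded (in practice you would paste in the SIZE column from docker history):

```python
# Sketch: total up human-readable layer sizes as printed by `docker history`
# (e.g. "934MB", "1.7GB", "0B"). The sample values below are illustrative.

UNITS = {"B": 1, "kB": 1e3, "MB": 1e6, "GB": 1e9}

def parse_size(s: str) -> float:
    """Convert a docker-style human size like '1.7GB' to bytes."""
    # Try the longest unit suffixes first so "MB" isn't matched as "B"
    for unit in sorted(UNITS, key=len, reverse=True):
        if s.endswith(unit):
            return float(s[: -len(unit)]) * UNITS[unit]
    raise ValueError(f"unrecognized size: {s}")

layers = ["934MB", "1.7GB", "200MB"]  # sample values, not real output
total_gb = sum(parse_size(s) for s in layers) / 1e9
print(f"{total_gb:.2f}GB")  # → 2.83GB
```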
Key findings:
- Base python:3.9 image: 934MB
- TensorFlow and its dependencies: 1.7GB
- Our application code: ~200MB
Step 2: Switching to a Slim Base Image
Changed from python:3.9 to python:3.9-slim:
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
New size: 1.9GB (Reduction: 39%)
Step 3: Optimizing Dependencies
I noticed we didn’t need the full TensorFlow package. Switched to TensorFlow Lite:
# requirements.txt
tflite-runtime
pandas
numpy
scikit-learn
pillow
flask
gunicorn
New size: 1.2GB (Reduction: 61%)
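Before rebuilding the image, it's worth verifying that a swap like this actually pays off. A minimal sketch for measuring the on-disk footprint of installed distributions (the package names are just examples, and sizes are approximate since it only counts files listed in the package's RECORD):

```python
from importlib import metadata

def dist_size_mb(name: str) -> float:
    """Approximate on-disk size of an installed distribution, in MB.

    Returns 0.0 if the package (or its file record) is unavailable.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return 0.0
    total = 0
    for f in dist.files or []:
        try:
            total += dist.locate_file(f).stat().st_size
        except OSError:
            continue  # file listed in RECORD but missing on disk
    return total / 1e6

# Compare the heavyweight package against its lighter alternative:
for pkg in ("tensorflow", "tflite-runtime"):
    print(f"{pkg}: {dist_size_mb(pkg):.1f}MB")
```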
Step 4: Multi-stage Build
Implemented a multi-stage build to separate build dependencies from runtime:
# Build stage
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
# Runtime stage
FROM python:3.9-slim
WORKDIR /app
# Copy the installed packages (and their console scripts) from the builder
COPY --from=builder /root/.local /root/.local
COPY app.py .
COPY models /app/models
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "app.py"]
New size: 850MB (Reduction: 72%)
Step 5: Cleaning Up Package Manager
Added cleanup commands and combined RUN statements:
# Stage 1: Builder
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
# Install dependencies and remove unnecessary files to reduce image size
RUN pip install --user --no-cache-dir -r requirements.txt && \
    find /root/.local \
        \( -type d \( -name test -o -name tests \) \) -prune -exec rm -rf '{}' + -o \
        \( -type f \( -name '*.pyc' -o -name '*.pyo' \) \) -exec rm -f '{}' +
# Stage 2: Final Image
FROM python:3.9-slim
WORKDIR /app
# Copy only necessary runtime dependencies
COPY --from=builder /root/.local /root/.local
COPY app.py .
# Install required system dependencies and clean up
RUN apt-get update && \
apt-get install --no-install-recommends -y libgomp1 && \
rm -rf /var/lib/apt/lists/*
# Ensure local Python packages are available
ENV PATH="/root/.local/bin:$PATH"
CMD ["python", "app.py"]
New size: 280MB (Total reduction: 90%)
Final Results
Let’s look at the progression:
- Initial image: 3.09GB
- Slim base image: 1.9GB
- Optimized dependencies: 1.2GB
- Multi-stage build: 850MB
- Final optimized image: 280MB
Key Takeaways
- Analyze First: Always use docker history to understand what's consuming space.
- Choose the Right Base: Slim variants can significantly reduce size without sacrificing functionality.
- Optimize Dependencies:
- Use lighter alternatives when possible
- Only install what you need
- Consider using wheels for Python packages
- Multi-stage Builds: Separate build-time dependencies from runtime needs.
- Clean Up: Remove unnecessary files and cache after installations.
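The "wheels" tip can be taken further than --no-cache-dir: build wheels once in a throwaway stage and install from them in the final one, so no compiler or build backend ever lands in the runtime image. A sketch along the lines of the Dockerfiles above (stage names and paths are illustrative):

```dockerfile
# Stage 1: build wheels for every dependency
FROM python:3.9-slim AS wheelbuilder
WORKDIR /wheels
COPY requirements.txt .
RUN pip wheel --no-cache-dir -r requirements.txt -w /wheels

# Stage 2: install from the prebuilt wheels only, then discard them
FROM python:3.9-slim
WORKDIR /app
COPY --from=wheelbuilder /wheels /wheels
RUN pip install --no-cache-dir --no-index /wheels/*.whl && \
    rm -rf /wheels
COPY app.py .
CMD ["python", "app.py"]
```

The --no-index flag guarantees the second stage never reaches out to PyPI, which also makes the build reproducible from the wheels alone.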
Bonus Tips
- Layer Caching: Keep frequently changing files in later layers.
- .dockerignore: Exclude unnecessary files from the build context.
- Use BuildKit: Enable Docker BuildKit for more efficient builds:
export DOCKER_BUILDKIT=1
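BuildKit also unlocks cache mounts, which let pip reuse its download cache across builds without that cache ever being written into an image layer. A sketch (the # syntax line opts into the newer Dockerfile frontend; note --no-cache-dir is deliberately omitted here so the cache is actually reused):

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
# The cache mount exists only at build time; it never appears in the image
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
```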
Measuring Impact
Beyond just size reduction, this optimization brought several benefits:
- 75% faster deployment times
- Reduced bandwidth costs
- Improved security with smaller attack surface
- Faster container startup times
Remember, the goal isn’t just to make images smaller – it’s to find the right balance between size and functionality for your specific use case.