
Docker Image Optimization in Action: Slimming Down from 1GB to 100MB

Monday morning, 9 AM. The CI/CD pipeline’s red light is flashing. I’m staring at a build that’s been running for 8 minutes. This deployment only changed a few lines of code, but the image upload alone takes 5 minutes—because the image is a whopping 1.2GB.

My boss messages in the group chat: “Why isn’t it live yet?”

I screenshot the build status and send it: Image uploading, 23% progress.

In that moment, I made a decision: This has to stop.

Later, I applied a series of optimizations to this Node.js application’s Docker image. The result? 98MB—roughly a 12x reduction in size. Build time dropped from 8 minutes to 2 minutes.

In this article, I’ll share these hands-on experiences with you. No fluff—every step has data comparisons, every code snippet is ready to use.

Why Is Your Image So Large?

Honestly, when I first ran docker history on that 1.2GB image, I was a bit confused.

docker history my-app:latest

The output looked something like this:

IMAGE          CREATED        SIZE
abc123def456   2 hours ago    850MB  # npm install artifacts
def456abc123   2 hours ago    180MB  # base image node:18
...

850MB just in the npm install layer. I paused for a moment—how did this get so huge?

Four Culprits Behind Image Bloat

Problem #1: Wrong base image choice.

My Dockerfile started with:

FROM node:18

The node:18 image is based on Debian and comes with a lot of things I don’t need: package managers, system utilities, development libraries. The base image alone is 900MB.

Check out this comparison of official images:

Image               Size
node:18             ~900MB
node:18-slim        ~230MB
node:18-alpine      ~170MB
alpine:3.18         ~5.5MB
distroless/static   ~2MB

See that? Just switching to node:18-alpine saves 730MB.

Problem #2: Build tools left behind.

I installed gcc, make, and python in the image because some npm packages need to be compiled. But here’s the thing—after compilation, I just left them there, eating up space in the image.

Problem #3: Layer stacking effect.

Each RUN instruction creates a new layer. My Dockerfile had a dozen RUN commands, and each layer carries all the files from previous layers. Deleted files are still there—they’re just “covered up.”

Problem #4: Poor cache optimization.

I put COPY . . at the very beginning, which means—every time I change one line of code, the entire image has to be rebuilt. npm install runs every time, downloading all dependencies again.

See It Clearly with dive

docker history alone isn’t intuitive enough. I recommend a tool called dive that can “peel apart” each layer of your image.

Installation is simple:

# macOS
brew install dive

# Linux
wget https://github.com/wagoodman/dive/releases/download/v0.12.0/dive_0.12.0_linux_amd64.deb
sudo dpkg -i dive_0.12.0_linux_amd64.deb

Then analyze your image:

dive my-app:latest

You’ll see an interactive interface with layers on the left and file changes on the right. Use arrow keys to navigate layers and clearly see which files were added, deleted, or modified in each layer.

The first time I used it, I discovered: node_modules appeared twice—once in /app/node_modules and once in a build stage temporary directory. How much space was that wasting?
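Beyond the interactive view, dive also has a CI mode that fails a build when wasted space creeps back in. It reads thresholds from a `.dive-ci` file in the project root—a sketch; the threshold values here are examples you should tune to your own project:

```yaml
# .dive-ci — example thresholds for `dive --ci my-app:latest`
rules:
  # fail if less than 90% of image bytes are "efficient" (not duplicated or overwritten)
  lowestEfficiency: 0.90
  # fail if more than 20MB total is wasted across layers
  highestWastedBytes: 20MB
  # fail if wasted bytes exceed 10% of the image size
  highestUserWastedPercent: 0.10
```

Run `dive --ci my-app:latest` in your pipeline and the exit code tells CI whether the image passed.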

5-Step Optimization Framework

Alright, problems identified. Now for solutions.

I’ve organized these methods into a 5-step framework, each with specific results data.

Step 1: Choose a Lightweight Base Image

This is the easiest step—change one line of code and see results.

# Before
FROM node:18

# After
FROM node:18-alpine

The effect? 900MB → 170MB. That’s 730MB saved, just by switching the base image.

How to choose between three lightweight options?

Image Type   Use Case                                   Pros & Cons
Alpine       General purpose                            Small, rich package manager; musl may cause compatibility issues
Distroless   Security-first                             Minimal, no shell; difficult to debug
Scratch      Statically compiled languages (Go, Rust)   Smallest (0MB); requires static compilation

In most cases, Alpine is a solid choice. If you’re using the official Node.js -alpine images, most musl compatibility issues have already been handled upstream.

Step 2: Use Multi-Stage Builds

This is a game-changer. The principle is simple: separate the build environment from the runtime environment, and the final image only keeps what’s needed to run.

# Build stage
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Runtime stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/main.js"]

The key is the COPY --from=builder line—it only “steals” the files you need from the builder stage.

My 1.2GB image, after adding multi-stage builds, dropped to 200MB. The build stage’s 850MB of node_modules and various tools were all discarded.

Step 3: Optimize Layer Caching

Docker’s layer caching works like this: if a layer hasn’t changed, use the cache.

The problem was, my Dockerfile had the wrong order:

# Wrong approach
COPY . .                    # Copy all files first
RUN npm install             # Then install dependencies

Written this way: change one line of code → COPY . . changes → cache invalidates → npm install runs again.

The correct approach:

# Correct approach
COPY package*.json ./       # Copy dependency files first
RUN npm install             # Install dependencies (cache reused if deps unchanged)
COPY . .                    # Copy source code last (source changes often, put it last)

After this change, as long as package.json doesn’t change, npm install uses the cache directly. Build time dropped from 3 minutes to 30 seconds.

Another tip: merge RUN instructions.

# Before (4 layers)
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*

# After (1 layer)
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Why? Because each RUN is a layer. Files you “delete” are still in previous layers. Merging into one instruction prevents intermediate files from entering the image.
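To make the “covered up, not removed” effect concrete, here’s a deliberately wasteful sketch (the 100MB test file is just for illustration):

```dockerfile
# Anti-example: two layers, ~100MB permanently wasted
RUN dd if=/dev/zero of=/tmp/big.bin bs=1M count=100   # layer 1: commits 100MB
RUN rm /tmp/big.bin                                   # layer 2: hides the file; layer 1 keeps its bytes

# Fix: create and delete in the same layer — nothing is committed
RUN dd if=/dev/zero of=/tmp/big.bin bs=1M count=100 && \
    rm /tmp/big.bin
```

Run `docker history` on both variants and you’ll see the difference immediately: the first carries a 100MB layer, the second doesn’t.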

Step 4: Configure .dockerignore

This step is often overlooked. When you run docker build, Docker packages the entire directory and sends it to the daemon. If you have 500MB of node_modules locally, it all gets packaged up.

Create .dockerignore in your project root:

.git
node_modules
*.log
.env
docker-compose.yml
README.md
.vscode
tests
coverage

The effect? Build context drops from 500MB to 50MB. The docker build command runs much faster.

Step 5: Clean Up Unnecessary Files

If you must install packages with apt, remember to clean up:

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Key points:

  • --no-install-recommends skips “recommended” packages, saving significant space
  • apt-get clean clears the apt cache
  • rm -rf /var/lib/apt/lists/* deletes package lists

After this step, you save another 50-100MB.
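On Alpine the same idea is built in: apk’s --no-cache flag fetches the package index on the fly instead of storing it, so there’s nothing to clean up afterwards.

```dockerfile
# Alpine equivalent of the apt cleanup above:
# --no-cache keeps the package index out of /var/cache/apk entirely
RUN apk add --no-cache curl ca-certificates
```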

Real-World Cases: Complete Optimization for 3 Languages

Talk is cheap. I’ve prepared complete Dockerfiles for three languages—copy and run them directly.

Node.js Application

Starting point: node:18 base image, 900MB

# Optimized complete Dockerfile
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
USER node
CMD ["node", "dist/main.js"]

Optimization path:

  1. Switch to alpine: 900MB → 170MB
  2. Multi-stage build: 170MB → 120MB
  3. Ship production dependencies only (prune dev dependencies after the build): 120MB → 98MB

Go Application

Go is a statically compiled language, naturally suited for Docker optimization.

# Build stage
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o main .

# Runtime stage (using scratch empty image)
FROM scratch
COPY --from=builder /app/main /main
ENTRYPOINT ["/main"]

Result: 800MB → under 10MB.

You read that right. scratch is an empty image containing only your compiled binary. Go’s static compilation doesn’t depend on any dynamic libraries—it just runs.
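One caveat: scratch also lacks TLS root certificates and timezone data, so a binary that makes HTTPS calls will fail with certificate errors. A common pattern—sketched here, assuming the builder stage has the ca-certificates (and, for zoneinfo, tzdata) packages installed—is to copy just those files across:

```dockerfile
FROM scratch
# scratch has no CA bundle or zoneinfo; copy them from the builder if needed
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /app/main /main
ENTRYPOINT ["/main"]
```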

Python Application

Python is a bit more complex because Alpine uses musl instead of glibc, which can cause issues with some packages.

FROM python:3.11-alpine AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --user -r requirements.txt
COPY . .

FROM python:3.11-alpine
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY --from=builder /app .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]

Important notes:

  • If pip install fails, you may need to install musl-dev and gcc
  • Some scientific computing packages (numpy, pandas) may have performance issues on Alpine
  • For stability, consider python:3.11-slim (Debian-based)
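If you do hit musl problems, the slim variant trades a few dozen MB for Debian’s glibc. A sketch of the same two-stage layout on python:3.11-slim (file names follow the Alpine example above):

```dockerfile
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
# --no-cache-dir keeps pip's download cache out of the layer
RUN pip install --user --no-cache-dir -r requirements.txt
COPY . .

FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /root/.local /root/.local
COPY --from=builder /app .
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "main.py"]
```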

Common Pitfalls and Solutions

I’ve stepped into plenty of traps during optimization. Let me warn you ahead of time.

Pitfall 1: Alpine’s musl Compatibility Issues

Symptom: An npm package fails to install, reporting missing glibc.

Cause: Alpine uses musl libc, not standard glibc. Some packages depend on glibc.

Solutions:

  • For Node.js: Use official node:18-alpine images, most issues are already handled
  • For Python: If installation fails, switch to python:3.11-slim (Debian-based)
  • If you must use Alpine: Try the apk add gcompat compatibility layer

Pitfall 2: Scratch Images Can’t Be Debugged

Symptom: docker exec -it container sh fails because scratch has no shell.

Solutions:

  • Use scratch for production, alpine for debugging stages
  • Or build a separate debug image:
    FROM alpine
    COPY --from=production /app /app
    CMD ["sh"]

Pitfall 3: CI/CD Cache Loss

Symptom: Local builds are fast, but CI starts from scratch every time.

Cause: CI environments don’t preserve Docker cache.

Solution: Use BuildKit’s cache mount:

# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm install

This way the npm cache persists between builds—provided your CI runner keeps BuildKit’s state around between runs.
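The same cache-mount trick works for other package managers; the target paths below are each tool’s default cache location:

```dockerfile
# pip (Python)
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Go module cache
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
```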

Pitfall 4: Over-aggressive .dockerignore Exclusion

Symptom: Build fails, complaining about missing files.

Cause: .dockerignore excluded files you actually need.

Solutions:

  • Progressive exclusion—start with obvious ones (node_modules, .git)
  • Use docker build --no-cache to verify clean builds
  • For exceptions, use !:
    tests
    !tests/fixtures

Verification and Continuous Improvement

Optimization done—how do you verify the results?

Verification Methods

1. Check image size

docker images my-app

2. View layer details

docker history my-app:latest --no-trunc

3. Visual analysis

dive my-app:latest

Performance Trade-offs

Smaller images—but will there be runtime issues?

Alpine’s musl vs glibc:

  • musl is lighter, but slightly slower in some scenarios
  • If your application makes heavy system calls, benchmark and compare

Scratch’s security:

  • Minimal attack surface, most secure
  • But when problems arise, you can’t enter the container to troubleshoot

My recommendation: From a security perspective, use scratch whenever possible. Switch to alpine for debugging. In CI, build both images—the one with -debug suffix is the debug version.

Continuous Improvement Suggestions

  1. Regularly update base images: Check once a month—security fixes are frequent
  2. Monitor image size: Add a check in CI, alert if over 100MB
  3. Use hadolint: Dockerfile static analysis tool, catches issues early
    docker run --rm -i hadolint/hadolint < Dockerfile
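The size check from point 2 can be a few lines of shell in CI. A sketch—in a real pipeline SIZE would come from `docker image inspect my-app:latest --format '{{.Size}}'`; here it is stubbed with an example value so the script runs standalone:

```shell
# Fail the pipeline if the image exceeds a size budget.
# In CI: SIZE=$(docker image inspect my-app:latest --format '{{.Size}}')
SIZE=${SIZE:-98000000}          # stub value (bytes) for illustration
BUDGET=$((100 * 1024 * 1024))   # 100MB budget

if [ "$SIZE" -gt "$BUDGET" ]; then
    echo "Image over budget: ${SIZE} bytes" >&2
    exit 1
fi
echo "Image within budget: ${SIZE} bytes"
```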
Tool          Purpose               Installation
dive          Image analysis        brew install dive
hadolint      Dockerfile linting    brew install hadolint
docker-slim   Auto-slimming         brew install docker-slim

docker-slim is interesting—it analyzes which files your image actually uses at runtime, then deletes everything else. But try it in a test environment first—don’t break production.

Docker Image Optimization 5-Step Method

Complete optimization process from analysis to verification, helping developers reduce Docker images from 1GB to 100MB

⏱️ Estimated time: 30 min

  1. Step 1: Choose a lightweight base image

     Switch the base image from the full version to a slim one:

     • node:18 → node:18-alpine (900MB → 170MB)
     • golang:1.22 → golang:1.22-alpine
     • python:3.11 → python:3.11-alpine

     Note: Alpine uses musl libc; packages depending on glibc may need extra handling.

  2. Step 2: Use multi-stage builds

     Separate build and runtime environments:

     • Build stage: install compilation tools and build dependencies
     • Runtime stage: copy only build artifacts and runtime dependencies
     • Use FROM ... AS builder to define the build stage
     • Use COPY --from=builder to copy build artifacts

  3. Step 3: Optimize layer cache order

     Order Dockerfile instructions to maximize cache reuse:

     • First copy dependency files (package.json, requirements.txt)
     • Then install dependencies (npm install, pip install)
     • Finally copy source code (COPY . .)

     Merge multiple RUN instructions to avoid intermediate file residue.

  4. Step 4: Configure .dockerignore

     Create a .dockerignore file in the project root to exclude unnecessary files:

     • .git, node_modules, *.log
     • .env, .vscode, tests
     • docker-compose.yml, README.md

     This greatly reduces the build context and improves build speed.

  5. Step 5: Clean up unnecessary files

     Clean caches and temporary files within RUN instructions:

     • Use --no-install-recommends to avoid installing recommended packages
     • Use apt-get clean to clear the package cache
     • Use rm -rf /var/lib/apt/lists/* to delete package lists
     • Clean /tmp/* and /var/tmp/*

Summary

After all that, the core is just 5 steps:

  1. Choose the right base image: Use alpine when possible, use scratch for static compilation
  2. Multi-stage builds: Separate build from runtime, keep only what’s needed
  3. Optimize layer caching: Dependency files first, source code last
  4. Configure .dockerignore: Exclude unnecessary files
  5. Clean up residual files: Delete apt caches, temporary files

Go check your Docker images now. Use docker history to see which layer takes the most space, then try the methods in this article.

See you in the comments—tell me your optimization results: from how many MB down to how many MB?

FAQ

What's the difference between Alpine and Debian base images?
Alpine is based on musl libc and busybox, with a size of only 5.5MB; Debian is based on standard glibc, around 80MB. Alpine is smaller but may have compatibility issues; Debian is more stable but larger.

Does multi-stage build affect build speed?
The first build is slightly slower (it builds two stages), but the final image size is greatly reduced. Subsequent builds with caching will match or even exceed single-stage build speed.

What scenarios is the scratch image suitable for?
Scratch is an empty image, suitable for statically compiled languages (Go, Rust). Use it when pursuing maximum security and minimum size in production. Switch to alpine for debugging, or build a separate debug image.

How to solve Alpine's glibc compatibility issues?
Three options: 1) Use official -alpine images (most issues are already handled); 2) Install the gcompat compatibility layer (apk add gcompat); 3) Switch to a Debian-based slim image instead.

How to preserve Docker cache in CI/CD environments?
Use BuildKit's cache mount feature: RUN --mount=type=cache,target=/root/.npm npm install. This lets CI environments reuse the dependency cache, improving build speed 5-10x.

9 min read · Published on: Mar 20, 2026 · Modified on: Mar 20, 2026
