Docker Compose Production Deployment: Health Checks, Restart Policies, and Resource Limits
At 3 AM, a server alert blasted me out of bed.
I opened my laptop to find all containers showing running status, with healthy green dots. But when I tried to access the service? 502 Bad Gateway.
The database container hadn’t finished starting yet, but the application container had already frantically tried to connect. Connection failed, service down. The container was still “running”, but the service was long dead.
That was my real experience the first time I deployed Docker Compose to production.
Getting it running and keeping it running are two completely different things. Your docker-compose up might start everything with a single command, but that doesn’t mean it can stand tall when memory explodes or processes crash in the middle of the night.
The three essentials of Docker Compose production deployment—health checks, restart policies, and resource limits—are what fill this gap. This article shares the configuration methods I’ve learned from trial and error, complete with copy-paste ready YAML templates to help you transform your containers from “barely running” to “rock solid.”
1. Health Checks: Determining If a Container Is Truly Ready
The running status shown by docker ps only tells you the container process exists. But can it actually serve requests? You don’t know.
Health checks are Docker’s way of giving your containers regular “physical exams”: sending HTTP requests, testing database connections, or running scripts to see if the service is genuinely alive.
How Health Checks Work
Docker runs the check command inside your container at the interval you define. A return code of 0 means healthy; non-zero means unhealthy. After the configured number of consecutive failures (retries), the container gets marked as unhealthy.
Here’s the key: Health check failures don’t automatically trigger restarts. They just reveal the status, telling you “this one has issues.” To make it self-healing, you need to combine it with depends_on conditional startup and restart policies.
Four Parameters You Need to Understand
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
  interval: 30s      # How often to check
  timeout: 10s       # Timeout for each check
  retries: 3         # Consecutive failures to mark unhealthy
  start_period: 60s  # Grace period during startup (failures don't count)
I once ignored the start_period parameter. The result? My app started slowly, needed 40 seconds to connect to the database, but health checks started after 10 seconds. Three consecutive failures, immediately marked as unhealthy. After adding start_period: 60s, I gave the application enough initialization time.
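To make the timing concrete, here is an annotated sketch of the same kind of check, tuned for an app that needs about 40 seconds to come up. The 10s interval and 5s timeout are illustrative values, not a recommendation.

```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
  interval: 10s       # a probe runs every 10s, including during start_period
  timeout: 5s         # a probe that takes longer than 5s counts as a failure
  retries: 3          # 3 consecutive counted failures => unhealthy
  start_period: 60s   # failures in the first 60s are not counted toward retries
# Rough timeline for an app that is ready after ~40s:
#   0-40s   probes fail, but they fall inside start_period, so they are ignored
#   ~40s    first successful probe => status flips from "starting" to "healthy"
# Without start_period, the failed probes at roughly 10s, 20s and 30s would
# already have marked the container unhealthy.
```

Once a probe succeeds, the grace period is over: later failures count toward retries as usual.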
Common Health Check Commands
Web Service:
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 30s
PostgreSQL:
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 10s
  timeout: 5s
  retries: 5
Redis:
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 10s
  timeout: 3s
  retries: 3
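Two variations on the commands above come up often. This is a sketch under assumptions: the first service assumes a slim image that ships BusyBox wget but no curl; the second assumes the official postgres image with POSTGRES_USER and POSTGRES_DB set in the container's environment. The service names, the image name myapp:latest, and the /health path are placeholders.

```yaml
services:
  web:
    image: myapp:latest   # placeholder image without curl
    healthcheck:
      # wget exits non-zero on connection errors and HTTP error statuses
      test: ["CMD-SHELL", "wget -q --spider http://localhost:3000/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

  db:
    image: postgres:15
    environment:
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: apppassword
      POSTGRES_DB: appdb
    healthcheck:
      # $$ escapes Compose interpolation, so the shell inside the container
      # expands the variables from the container's own environment.
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5
```

Reading the user and database from the environment keeps the check in sync if you rename either one later.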
Combining with depends_on for Dependency Startup
This is where health checks really shine: making dependent services wait until they’re truly ready.
services:
  app:
    depends_on:
      db:
        condition: service_healthy     # Wait for DB health check to pass
      redis:
        condition: service_healthy     # Wait for Redis health check to pass
I used to just use depends_on: [db, redis], but the app container would start while the database was still initializing. Connection failed, immediate error and exit. After switching to condition: service_healthy, the application patiently waits until the database can respond to pg_isready before starting. Peace at last.
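One detail that is easy to miss: condition: service_healthy only works if the dependency actually declares a health check. A minimal sketch of the pairing, with a placeholder app image and a placeholder password:

```yaml
services:
  app:
    image: myapp:latest              # placeholder image
    depends_on:
      db:
        condition: service_healthy   # start app only after db reports healthy

  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example     # placeholder; use a real secret in practice
    healthcheck:                     # without this block there is nothing to wait on
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
```

Keep in mind that this controls startup order only. If db turns unhealthy later, app keeps running, so the application still needs its own reconnect logic.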
2. Restart Policies: Giving Containers Self-Healing Ability
When a container crashes, who brings it back up?
Manual docker restart? Try that at 3 AM when you get an alert.
Restart policies let the Docker daemon handle this for you. After a container exits, Docker automatically decides whether to restart it.
Comparing Four Policies
| Policy | Behavior | Use Case |
|---|---|---|
| no | It crashes, it stays crashed | Temporary testing, CI/CD |
| always | Restart no matter how it exits | Core services |
| on-failure | Only restart on abnormal exit | Task-based containers |
| unless-stopped | Always restart unless manually stopped | Production favorite |
Production Choice: unless-stopped
restart: unless-stopped
Why recommend unless-stopped over always?
The difference is what happens after a manual docker stop:
- always: After a manual stop, the container starts again when the Docker daemon or the host restarts
- unless-stopped: After a manual stop, it stays stopped and won't come back to life on its own
Imagine this: You manually stop a container for maintenance, then the server reboots and it starts running again. You might just want to say: what the hell.
Retry Limits with on-failure
on-failure can set retry limits:
restart: on-failure:5 # Restart at most 5 times
If a container fails to start 5 consecutive times, Docker gives up. Perfect for scenarios where repeated crashes might be caused by external issues (database unreachable, config errors)—preventing infinite restart loops.
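As a rough illustration, a task-style service with a capped retry count could look like this. The service name, image, and script are hypothetical and only stand in for whatever job you run.

```yaml
services:
  nightly-report:                 # hypothetical service name
    image: myjobs/report:1.0      # hypothetical image
    command: ["python", "generate_report.py"]   # hypothetical entry script
    restart: on-failure:5         # retry non-zero exits, give up after 5 restarts
```

A clean exit (code 0) is treated as "job done" and does not trigger a restart, which is what you want for one-shot tasks.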
How to Choose? Simple Guidelines
- Core services (Web, API, Database): unless-stopped
- Background tasks, scheduled scripts: on-failure
- Development debugging, temporary runs: no
One pitfall: Restart policies only handle “should the container restart after exiting”. They don’t care about “is the service actually usable”. For that, you need health checks.
3. Resource Limits: Preventing Runaway Containers
Have you ever experienced this: one container has a memory leak, eats up all server memory, and the OOM Killer takes out the other containers along with it?
I have. That feeling… well.
Resource limits set a “ceiling” for each container: exceed this limit, and it gets killed, protecting other services.
limits vs reservations
deploy:
  resources:
    limits:
      cpus: '1.0'      # Maximum 1 CPU
      memory: 512M     # Maximum 512MB memory
    reservations:
      cpus: '0.5'      # Reserve at least 0.5 CPU
      memory: 256M     # Reserve at least 256MB memory
- limits: Hard caps. Exceed the memory limit and the process gets killed (OOM); hit the CPU limit and it gets throttled
- reservations: Soft limits that tell the scheduler "this container needs at least this much"
To put it plainly: limits is “can’t exceed”, reservations is “at least guaranteed”.
How to Set CPU Limits?
cpus: '1.0' # Maximum 1 full CPU core
cpus: '0.5' # Maximum 50% CPU
cpus: '2.0' # Maximum 2 cores
CPU limits are enforced by throttling: a container that hits its limit is slowed down, not killed. So err on the higher side.
How to Set Memory Limits?
memory: 512M # 512MB
memory: 2G # 2GB
Memory limits are hard limits. Exceed them, and the container gets killed by OOM Killer—no negotiation.
My empirical values:
- Node.js applications: At least 512M, production recommended 1G
- Python applications: 256M - 512M
- PostgreSQL: Based on connections and data volume, 1G - 4G
- Redis: 256M - 512M, larger if used for caching
A Practical Configuration
services:
  app:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.25'
          memory: 256M
This configuration means: the app can use at most 1 CPU and 1GB of memory, and asks for at least 0.25 CPU and 256MB of memory. Under Swarm the scheduler honors that reservation when placing the container; on a single host, treat it as a hint rather than a hard guarantee.
Note: The deploy key comes from Docker Swarm. With the legacy docker-compose (V1) binary, deploy.resources is ignored by docker-compose up unless you pass the --compatibility flag. Docker Compose V2 (docker compose) applies deploy.resources.limits directly on a single machine; I'd stay on V2.20 or newer. The older service-level keys mem_limit and cpus are the alternative for single-machine setups.
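For reference, the service-level form mentioned in the note might look like the sketch below. It assumes a Compose implementation that follows the Compose Specification, where cpus, mem_limit, and mem_reservation are plain service attributes; the image name is a placeholder.

```yaml
services:
  app:
    image: myapp:latest      # placeholder image
    cpus: 1.0                # at most 1 CPU (throttled above this)
    mem_limit: 1g            # hard memory cap (OOM kill above this)
    mem_reservation: 256m    # soft memory reservation
```

Pick one form per service rather than mixing it with deploy.resources. Whichever you use, running docker stats after startup shows whether the memory limit was actually applied.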
4. Complete Template: Production-Ready YAML You Can Copy
Each element is useful on its own, but combined they create production-grade configuration. Here’s a complete example with Web service + PostgreSQL + Redis that you can copy and adapt.
Complete Example
version: '3.8'

services:
  # Web Application
  app:
    image: myapp:latest
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.25'
          memory: 256M
    logging:
      driver: json-file
      options:
        max-size: "10m"   # Max 10MB per log file
        max-file: "3"     # Keep at most 3 log files

  # PostgreSQL Database
  db:
    image: postgres:15
    restart: unless-stopped
    environment:
      POSTGRES_USER: appuser
      POSTGRES_PASSWORD: apppassword
      POSTGRES_DB: appdb
    healthcheck:
      # Must match POSTGRES_USER and POSTGRES_DB above
      test: ["CMD-SHELL", "pg_isready -U appuser -d appdb"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 512M
    volumes:
      - pgdata:/var/lib/postgresql/data

  # Redis Cache
  redis:
    image: redis:7-alpine
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          memory: 128M

volumes:
  pgdata:
Configuration Separation Technique
Development and production environments usually need different configurations. Manage them with two separate files:
# compose.yaml - Development
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    restart: "no"   # Don't auto-restart during development

# compose.production.yaml - Production Override
version: '3.8'
services:
  app:
    image: myapp:v1.2.3   # Use pre-built image in production
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      resources:
        limits:
          memory: 1G
Start production environment:
docker-compose -f compose.yaml -f compose.production.yaml up -d
The two files merge, with compose.production.yaml overriding compose.yaml.
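It helps to picture what the merge produces. The sketch below is my approximation of the effective app service for the two files above, not literal tool output; running the same command with config in place of up -d prints the real merged result in a normalized form.

```yaml
# Approximate effective configuration after merging both files
services:
  app:
    build: .                  # from compose.yaml; the override does not remove it
    image: myapp:v1.2.3       # added by compose.production.yaml
    ports:
      - "3000:3000"           # from compose.yaml
    restart: unless-stopped   # replaces restart: "no"
    healthcheck:              # added by compose.production.yaml
      test: ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      resources:
        limits:
          memory: 1G
```

The merge is additive: single-value keys such as restart are replaced, while keys that exist only in the base file stay in place. If production should never build or publish port 3000, move build and ports into a development-only override file instead of the base file.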
Log Management: Preventing Disk Full
Docker’s default log driver is json-file, and logs grow indefinitely. Without limits, your disk gets eaten by logs after a few months.
logging:
  driver: json-file
  options:
    max-size: "10m"   # Max 10MB per file
    max-file: "3"     # Max 3 files, total 30MB max
Each container gets max 30MB of logs with automatic rotation. I add this to every service now—saves me from manual cleanup later.
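Repeating the same block in every service gets tedious. One way to avoid that, sketched here with an extension field and a YAML anchor, is to define the logging settings once and reference them. The name x-logging and the anchor default-logging are arbitrary choices of mine.

```yaml
x-logging: &default-logging   # extension field; Compose ignores top-level x-* keys
  driver: json-file
  options:
    max-size: "10m"
    max-file: "3"

services:
  app:
    image: myapp:latest
    logging: *default-logging   # reuse the shared settings
  db:
    image: postgres:15
    logging: *default-logging
  redis:
    image: redis:7-alpine
    logging: *default-logging
```

Changing the rotation policy then means editing a single block instead of one per service.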
Summary
After all that, the core logic of the three essentials is:
Health checks detect problems → Restart policies auto-recover → Resource limits contain failures
With this combination, your Docker Compose application can:
- Wait for dependencies to be truly ready at startup, instead of blindly rushing in
- Get back up on its own after crashing—you don’t need to wake up at 3 AM
- Prevent one runaway container from taking down the entire server
Now go check your docker-compose.yml. What’s missing? Add it.
If you haven’t started using Docker Compose for production deployment yet, bring these three configurations next time. You’ll thank yourself.
Configure Docker Compose Production Deployment Essentials
Add health checks, restart policies, and resource limits to Docker Compose for production-grade stable deployment
⏱️ Estimated time: 15 min
Step 1: Add health check configuration
Add healthcheck configuration for each service:
• test: Check command (curl, pg_isready, redis-cli ping, etc.)
• interval: Check interval (recommended 10-30s)
• timeout: Timeout duration (recommended 5-10s)
• retries: Number of consecutive failures (recommended 3-5)
• start_period: Startup grace period (set 30-60s based on app startup time)
Step 2: Configure dependency conditional startup
Use the condition parameter of depends_on:
• Change depends_on: [db] to depends_on: db: condition: service_healthy
• Ensure dependent services have health checks configured
• Application will wait for dependencies to be truly available before starting
Step 3: Set restart policy
Choose restart policy based on service type:
• Core services (Web/API/Database): restart: unless-stopped
• Background tasks/scheduled scripts: restart: on-failure:5
• Development debugging: restart: "no"
• Avoid using always (may unexpectedly restart after manual stop)
Step 4: Configure resource limits
Set limits and reservations in deploy.resources:
• limits: Hard caps; exceeding the memory limit gets the container killed by the OOM Killer
• reservations: Soft limits, the minimum resources the container asks for
• Node.js applications: limits.memory recommended at least 512M
• Databases: Set 1G-4G based on connections and data volume
Step 5: Add log rotation configuration
Prevent log files from filling up disk:
• logging.driver: json-file (default driver)
• logging.options.max-size: "10m" (max 10MB per file)
• logging.options.max-file: "3" (keep 3 files)
• Total log cap 30MB, automatic rotation
FAQ
How to use different configurations for development and production?
• compose.yaml: Development config (build: ., restart: "no")
• compose.production.yaml: Production override (image: xxx, restart: unless-stopped)
• Start command: docker-compose -f compose.yaml -f compose.production.yaml up -d
• The latter overrides the former's configuration, achieving configuration separation
8 min read · Published on: Apr 24, 2026 · Modified on: Apr 25, 2026