DevOps & Containers
Docker in Production: Multi-Stage Builds, Layer Caching, and Security Hardening
14 min read
Tags: Docker, Containers, DevOps, Production, Security
Production-ready Docker practices including multi-stage builds for smaller images, layer caching strategies, security scanning, and docker-compose orchestration patterns.
Running Docker in production requires more than docker run. This guide covers optimization techniques, security hardening, and patterns I use for production containerized applications.
Multi-Stage Builds: Smaller, Faster Images
# ❌ Bad: Single-stage build (800MB+)
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
CMD ["node", "dist/server.js"]

# ✅ Good: Multi-stage build (150MB)
# Stage 1: Production dependencies only
FROM node:18-alpine AS deps
WORKDIR /app
COPY package*.json ./
# --omit=dev is the current spelling; newer npm versions deprecate --only=production
RUN npm ci --omit=dev

# Stage 2: Build (needs devDependencies too)
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 3: Production runtime
FROM node:18-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production

# Copy only production dependencies from deps stage
COPY --from=deps /app/node_modules ./node_modules
# Copy only built artifacts from builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/public ./public

# Create non-root group and user, then drop privileges
RUN addgroup -g 1001 -S nodejs \
 && adduser -S nextjs -u 1001 -G nodejs
USER nextjs

EXPOSE 3000
CMD ["node", "dist/server.js"]

# Result: 150MB vs 800MB (5x smaller)
# Faster downloads, less attack surface, lower costs
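Both Dockerfiles above run COPY . ., so everything in the build context is sent to the daemon and can end up in the builder stage. A minimal sketch of a .dockerignore that keeps the context small, written here with a shell heredoc. The entries are assumptions about a typical Node.js project layout, not something the build requires; adjust them to your repository.

```shell
#!/bin/sh
# Sketch: create a .dockerignore next to the Dockerfile.
# The entries below are assumed, typical exclusions for a Node.js project.
cat > .dockerignore <<'EOF'
node_modules
dist
.git
tests
docs
*.log
.env
EOF

# Show what will be excluded from the build context
cat .dockerignore
```

Excluding node_modules and dist is safe in this setup because both are recreated inside the image by npm ci and npm run build.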
Layer Caching for Fast Builds
# ❌ Bad: Bust cache on any file change
FROM python:3.11-slim
WORKDIR /app
# This invalidates the cache for every later layer when ANY file changes
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

# ✅ Good: Optimize layer order
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies (rarely change)
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first (only cache-bust when deps change)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy app code last (changes frequently)
COPY . .

CMD ["python", "app.py"]

# Build time with cache:
# First build: 3 minutes
# Rebuild after code change: 5 seconds (reuses dependency layers)
Security Hardening Checklist
# Production-grade secure Dockerfile
FROM node:18-alpine

# 1. Create a non-root user and group
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# 2. Install only what's needed
#    (--ignore-scripts blocks install scripts; drop it if a dependency needs them)
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev --ignore-scripts

# 3. Copy application files
COPY . .

# 4. Remove unnecessary files
#    Note: deleting in a later layer hides the files but does not shrink the image;
#    excluding them via .dockerignore is more effective.
RUN rm -rf tests/ docs/ .git/

# 5. Set environment variables
ENV NODE_ENV=production \
    PORT=3000

# 6. Use specific port
EXPOSE 3000

# 7. Switch to non-root user
USER appuser

# 8. Add health check (assumes healthcheck.js exists in the app and exits non-zero on failure)
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD node healthcheck.js || exit 1

# 9. Use exec form of CMD
CMD ["node", "server.js"]

# Security scanning:
# docker scout cves my-image   (replaces the retired "docker scan")
# trivy image my-image
# snyk container test my-image
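The scanners listed above are most useful when a failed scan actually blocks a release. One way to wire that up, sketched as a small shell function around Trivy. It assumes Trivy is installed on the CI runner; the function name scan_image is hypothetical, and my-image is the same placeholder used above.

```shell
#!/bin/sh
# Sketch of a CI gate around an image scanner (assumes Trivy is installed).
# scan_image is a hypothetical helper name, not part of any tool.

scan_image() {
  image="$1"
  if ! command -v trivy >/dev/null 2>&1; then
    echo "trivy not found; refusing to treat $image as scanned" >&2
    return 2
  fi
  # --exit-code 1 makes trivy exit non-zero when findings match the severity filter
  trivy image --exit-code 1 --severity HIGH,CRITICAL "$image"
}

# Example (not executed here):
#   scan_image my-image && echo "no HIGH/CRITICAL findings, ok to deploy"
```

Returning a non-zero status when the scanner is missing is a deliberate choice in this sketch: a pipeline that silently skips the scan looks the same as one that passed it.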
Docker Compose for Multi-Container Apps
# docker-compose.yml - Production-ready setup
# The top-level "version" key is obsolete under Compose v2 (ignored with a warning);
# it only matters for the legacy docker-compose v1 binary.
version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: runner  # Must match a stage name in the Dockerfile (e.g. "AS runner")
    ports:
      - "3000:3000"
    environment:
      # Same password the db service uses, read from the .env file
      DATABASE_URL: postgres://postgres:${DB_PASSWORD}@db:5432/myapp
      REDIS_URL: redis://cache:6379
      NODE_ENV: production
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    restart: unless-stopped
    networks:
      - app-network
    volumes:
      - ./logs:/app/logs  # Persist logs
    healthcheck:
      # Assumes an Alpine-based image, which ships BusyBox wget but not curl
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: ${DB_PASSWORD}  # Use .env file
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    # Only needed for access from the host; other services reach db over app-network
    ports:
      - "5432:5432"
    restart: unless-stopped
    networks:
      - app-network
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  cache:
    image: redis:7-alpine
    # Only needed for access from the host; other services reach cache over app-network
    ports:
      - "6379:6379"
    restart: unless-stopped
    networks:
      - app-network
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - app
    restart: unless-stopped
    networks:
      - app-network

volumes:
  postgres-data:
  redis-data:

networks:
  app-network:
    driver: bridge

# Usage (with Compose v2, write "docker compose" instead of "docker-compose"):
# docker-compose up -d        # Start all services
# docker-compose ps           # Check status
# docker-compose logs -f app  # View app logs
# docker-compose down -v      # Stop and remove volumes (this deletes the database data)
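The compose file above reads ${DB_PASSWORD} from a .env file in the project directory, which Compose loads automatically for variable substitution. A minimal sketch of generating that file with a random password; the 16-byte length and the use of /dev/urandom are assumptions, and a secrets manager may be a better fit for real deployments.

```shell
#!/bin/sh
# Sketch: create .env for docker-compose variable substitution.
# Assumes a Unix-like host with /dev/urandom; the password length is arbitrary.
umask 077  # files created below are readable by the owner only

if [ ! -f .env ]; then
  # 16 random bytes rendered as 32 lowercase hex characters
  DB_PASSWORD="$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  printf 'DB_PASSWORD=%s\n' "$DB_PASSWORD" > .env
  echo "wrote .env"
else
  echo ".env already exists, leaving it untouched"
fi
```

Keep .env out of version control and out of the image (for example via .gitignore and .dockerignore).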
Pro Tips and Gotchas
- Use .dockerignore to exclude node_modules, .git, tests from build context (faster builds)
- Pin base image versions: node:18.15-alpine not node:latest (reproducible builds)
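- BuildKit builds independent stages in parallel and is the default builder since Docker Engine 23.0; on older versions enable it with DOCKER_BUILDKIT=1 docker build . (often 2-3x faster)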
- Layer sizes shown with 'docker history <image>' - find bloat sources
- Bind mounts for development: docker run -v "$(pwd)":/app for hot reload (quote the path so directories with spaces work)
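- Named volumes persist data: a volume from docker volume create outlives the containers that use it, until it is removed explicitly (docker volume rm, docker-compose down -v)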
- Container logs: docker logs <container> --follow --tail 100
- Exec into running container: docker exec -it <container> sh
- Resource limits: --memory=512m --cpus=0.5 prevents resource hogging
- Security: Never run containers as root in production, scan images regularly
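- Restart policies: both always and unless-stopped bring containers back after a daemon restart; the difference is that unless-stopped leaves a manually stopped container stopped, while always starts it again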
- Multi-platform builds: docker buildx build --platform linux/amd64,linux/arm64
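The resource-limit and restart-policy tips above combine into a single docker run invocation. A small sketch that assembles the command and, by default, only prints it. The helper name run_limited, the DRY_RUN switch, and the 3000:3000 port mapping are assumptions for illustration; the flags themselves are the ones from the list.

```shell
#!/bin/sh
# Sketch: build a docker run command with the limits from the tips above.
# run_limited and DRY_RUN are hypothetical names used only in this example.

run_limited() {
  # Rebuild the argument list; "$1" (the image) is expanded before set -- replaces it
  set -- docker run -d \
    --memory=512m --cpus=0.5 \
    --restart unless-stopped \
    -p 3000:3000 "$1"

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"   # print the command instead of running it
  else
    "$@"        # actually start the container
  fi
}

run_limited "${IMAGE:-my-image}"
```

With the defaults this prints the command rather than starting anything; setting DRY_RUN=0 runs it against a real Docker daemon.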