Introduction
# Container Debug
Debug running Docker containers and Compose services. Covers logs, exec, networking, resource inspection, multi-stage builds, health checks, and common failure patterns.
## When to Use
- Container exits immediately or crashes on start - Application inside container behaves differently than on host - Containers can't communicate with each other - Container is using too much memory or CPU - Multi-stage Docker build produces unexpected results - Health checks are failing - Compose services start in wrong order or can't connect
## Container Logs
### View and filter logs
```bash # Last 100 lines docker logs --tail 100 my-container
# Follow (stream) logs docker logs -f my-container
# Follow with timestamps docker logs -f -t my-container
# Logs since a time docker logs --since 30m my-container docker logs --since "2026-02-03T10:00:00" my-container
# Logs between times docker logs --since 1h --until 30m my-container
# Compose: logs for all services docker compose logs -f
# Compose: logs for specific service docker compose logs -f api db
# Redirect logs to file for analysis docker logs my-container > container.log 2>&1
# Separate stdout and stderr docker logs my-container > stdout.log 2> stderr.log ```
### Inspect log driver
```bash # Check what log driver a container uses docker inspect --format='{{.HostConfig.LogConfig.Type}}' my-container
# If json-file driver, find the actual log file docker inspect --format='{{.LogPath}}' my-container
# Check log file size ls -lh $(docker inspect --format='{{.LogPath}}' my-container) ```
## Exec Into Containers
### Interactive shell
```bash # Bash (most common) docker exec -it my-container bash
# If bash isn't available (Alpine, distroless) docker exec -it my-container sh
# As root (even if container runs as non-root user) docker exec -u root -it my-container bash
# With specific environment variables docker exec -e DEBUG=1 -it my-container bash
# Run a single command (no interactive shell) docker exec my-container cat /etc/os-release docker exec my-container ls -la /app/ docker exec my-container env ```
### Debug a crashed container
```bash # Container exited? Check exit code docker inspect --format='{{.State.ExitCode}}' my-container docker inspect --format='{{.State.Error}}' my-container
# Common exit codes: # 0 = clean exit # 1 = application error # 137 = killed (OOM or docker kill) — 128 + signal 9 # 139 = segfault — 128 + signal 11 # 143 = terminated (SIGTERM) — 128 + signal 15
# Start a stopped container to debug it docker start -ai my-container
# Or override the entrypoint to get a shell docker run -it --entrypoint sh my-image
# Copy files out of a stopped container docker cp my-container:/app/error.log ./error.log docker cp my-container:/etc/nginx/nginx.conf ./nginx.conf ```
### Debug without a shell (distroless / scratch images)
```bash # Use docker cp to extract files docker cp my-container:/app/config.json ./
# Use nsenter to get a shell in the container's namespace (Linux) PID=$(docker inspect --format='{{.State.Pid}}' my-container) nsenter -t $PID -m -u -i -n -p -- /bin/sh
# Attach a debug container to the same namespace docker run -it --pid=container:my-container --net=container:my-container busybox sh
# Docker Desktop: use debug extension docker debug my-container ```
## Networking
### Inspect container networking
```bash # Show container IP address docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' my-container
# Show all network details docker inspect -f '{{json .NetworkSettings.Networks}}' my-container | jq
# List all networks docker network ls
# Inspect a network (see all connected containers) docker network inspect bridge docker network inspect my-compose-network
# Show port mappings docker port my-container ```
### Test connectivity between containers
```bash # From inside container A, reach container B docker exec container-a ping container-b docker exec container-a curl http://container-b:8080/health
# DNS resolution inside container docker exec my-container nslookup db docker exec my-container cat /etc/resolv.conf docker exec my-container cat /etc/hosts
# Test if port is reachable docker exec my-container nc -zv db 5432 docker exec my-container wget -qO- http://api:3000/health
# If curl/ping not available in container, install or use a debug container: docker run --rm --network container:my-container curlimages/curl curl -s http://localhost:8080 ```
### Common networking issues
```bash # "Connection refused" between containers # 1. Check the app binds to 0.0.0.0, not 127.0.0.1 docker exec my-container netstat -tlnp # If listening on 127.0.0.1 — fix the app config
# 2. Check containers are on the same network docker inspect -f '{{json .NetworkSettings.Networks}}' container-a | jq 'keys' docker inspect -f '{{json .NetworkSettings.Networks}}' container-b | jq 'keys'
# 3. Check published ports vs exposed ports # EXPOSE only documents, it doesn't publish # Use -p host:container to publish
# "Name not found" — DNS not resolving container names # Container names resolve only on user-defined networks, NOT the default bridge docker network create my-net docker run --network my-net --name api my-api-image docker run --network my-net --name db postgres # Now "api" and "db" resolve to each other ```
### Capture network traffic
```bash # tcpdump inside a container docker exec my-container tcpdump -i eth0 -n port 8080
# If tcpdump not available, use a sidecar docker run --rm --net=container:my-container nicolaka/netshoot tcpdump -i eth0 -n
# netshoot has: tcpdump, curl, nslookup, netstat, iperf, etc. docker run --rm --net=container:my-container nicolaka/netshoot bash ```
## Resource Usage
### Real-time stats
```bash # All containers docker stats
# Specific containers docker stats api db redis
# One-shot (no streaming) docker stats --no-stream
# Formatted output docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}" ```
### Memory investigation
```bash # Check memory limit docker inspect --format='{{.HostConfig.Memory}}' my-container # 0 means unlimited
# Check if container was OOM-killed docker inspect --format='{{.State.OOMKilled}}' my-container
# Memory usage breakdown (Linux cgroups) docker exec my-container cat /sys/fs/cgroup/memory.current 2>/dev/null || \ docker exec my-container cat /sys/fs/cgroup/memory/memory.usage_in_bytes
# Process memory inside container docker exec my-container ps aux --sort=-%mem | head -10 docker exec my-container top -bn1 ```
### Disk usage
```bash # Overall Docker disk usage docker system df docker system df -v
# Container filesystem size docker inspect --format='{{.SizeRw}}' my-container
# Find large files inside container docker exec my-container du -sh /* 2>/dev/null | sort -rh | head -10 docker exec my-container find /tmp -size +10M -type f
# Check for log file bloat docker exec my-container ls -lh /var/log/ ```
## Dockerfile Debugging
### Multi-stage build debugging
```bash # Build up to a specific stage docker build --target builder -t my-app:builder .
# Inspect what's in the builder stage docker run --rm -it my-app:builder sh docker run --rm my-app:builder ls -la /app/ docker run --rm my-app:builder cat /app/package.json
# Check which files made it to the final image docker run --rm my-image ls -laR /app/
# Build with no cache (fresh build) docker build --no-cache -t my-app .
# Build with progress output docker build --progress=plain -t my-app . ```
### Image inspection
```bash # Show image layers (size of each) docker history my-image docker history --no-trunc my-image
# Inspect image config (entrypoint, cmd, env, ports) docker inspect my-image | jq '.[0].Config | {Cmd, Entrypoint, Env, ExposedPorts, WorkingDir}'
# Compare two images docker history image-a --format "{{.Size}}\t{{.CreatedBy}}" > layers-a.txt docker history image-b --format "{{.Size}}\t{{.CreatedBy}}" > layers-b.txt diff layers-a.txt layers-b.txt
# Find what changed between builds docker diff my-container # A = added, C = changed, D = deleted ```
## Health Checks
### Define and debug health checks
```dockerfile # In Dockerfile HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \ CMD curl -f http://localhost:8080/health || exit 1 ```
```bash # Check health status docker inspect --format='{{.State.Health.Status}}' my-container # "healthy", "unhealthy", or "starting"
# See health check log (last 5 results) docker inspect --format='{{json .State.Health}}' my-container | jq
# Run health check manually docker exec my-container curl -f http://localhost:8080/health
# Override health check at run time docker run --health-cmd "curl -f http://localhost:8080/health || exit 1" \ --health-interval 10s my-image
# Disable health check docker run --no-healthcheck my-image ```
## Docker Compose Debugging
### Service startup issues
```bash # Check service status docker compose ps
# See why a service failed docker compose logs failed-service
# Start with verbose output docker compose up --build 2>&1 | tee compose.log
# Start a single service (with dependencies) docker compose up db
# Start without dependencies docker compose up --no-deps api
# Recreate containers from scratch docker compose up --force-recreate --build
# Check effective config (after variable substitution) docker compose config ```
### Service dependency and startup order
```yaml # docker-compose.yml services: api: depends_on: db: condition: service_healthy redis: condition: service_started
db: image: postgres:16 healthcheck: test: ["CMD-SHELL", "pg_isready -U postgres"] interval: 5s timeout: 5s retries: 5
redis: image: redis:7 healthcheck: test: ["CMD", "redis-cli", "ping"] interval: 5s timeout: 5s retries: 5 ```
```bash # Wait for a service to be healthy before running commands docker compose up -d db docker compose exec db pg_isready # Polls until ready docker compose up -d api ```
## Cleanup
```bash # Remove stopped containers docker container prune
# Remove unused images docker image prune
# Remove everything unused (containers, images, networks, volumes) docker system prune -a
# Remove volumes too (WARNING: deletes data) docker system prune -a --volumes
# Remove dangling build cache docker builder prune ```
## Tips
- `docker logs -f` is the first thing to check. Most container failures are visible in the logs. - Exit code 137 means OOM-killed. Increase the memory limit or fix the memory leak. - Apps inside containers must bind to `0.0.0.0`, not `127.0.0.1`. Localhost inside a container is isolated. - Container names only resolve via DNS on user-defined networks, not the default `bridge`. Always create a custom network for multi-container setups. - `docker exec` only works on running containers. For crashed containers, use `docker cp` to extract logs or override the entrypoint with `docker run --entrypoint sh`. - `nicolaka/netshoot` is the Swiss Army knife for container networking. It has every networking tool pre-installed. - `--progress=plain` during builds shows full command output, which is essential for debugging build failures. - Health checks with `start-period` prevent false unhealthy status during slow application startup.