ClawSkills logoClawSkills

Container Debug

Debug running Docker containers and Compose services. Use when inspecting container logs, exec-ing into running containers, diagnosing networking issues, checki

Introduction

# Container Debug

Debug running Docker containers and Compose services. Covers logs, exec, networking, resource inspection, multi-stage builds, health checks, and common failure patterns.

## When to Use

- Container exits immediately or crashes on start - Application inside container behaves differently than on host - Containers can't communicate with each other - Container is using too much memory or CPU - Multi-stage Docker build produces unexpected results - Health checks are failing - Compose services start in wrong order or can't connect

## Container Logs

### View and filter logs

```bash # Last 100 lines docker logs --tail 100 my-container

# Follow (stream) logs docker logs -f my-container

# Follow with timestamps docker logs -f -t my-container

# Logs since a time docker logs --since 30m my-container docker logs --since "2026-02-03T10:00:00" my-container

# Logs between times docker logs --since 1h --until 30m my-container

# Compose: logs for all services docker compose logs -f

# Compose: logs for specific service docker compose logs -f api db

# Redirect logs to file for analysis docker logs my-container > container.log 2>&1

# Separate stdout and stderr docker logs my-container > stdout.log 2> stderr.log ```

### Inspect log driver

```bash # Check what log driver a container uses docker inspect --format='{{.HostConfig.LogConfig.Type}}' my-container

# If json-file driver, find the actual log file docker inspect --format='{{.LogPath}}' my-container

# Check log file size ls -lh $(docker inspect --format='{{.LogPath}}' my-container) ```

## Exec Into Containers

### Interactive shell

```bash # Bash (most common) docker exec -it my-container bash

# If bash isn't available (Alpine, distroless) docker exec -it my-container sh

# As root (even if container runs as non-root user) docker exec -u root -it my-container bash

# With specific environment variables docker exec -e DEBUG=1 -it my-container bash

# Run a single command (no interactive shell) docker exec my-container cat /etc/os-release docker exec my-container ls -la /app/ docker exec my-container env ```

### Debug a crashed container

```bash # Container exited? Check exit code docker inspect --format='{{.State.ExitCode}}' my-container docker inspect --format='{{.State.Error}}' my-container

# Common exit codes: # 0 = clean exit # 1 = application error # 137 = killed (OOM or docker kill) — 128 + signal 9 # 139 = segfault — 128 + signal 11 # 143 = terminated (SIGTERM) — 128 + signal 15

# Start a stopped container to debug it docker start -ai my-container

# Or override the entrypoint to get a shell docker run -it --entrypoint sh my-image

# Copy files out of a stopped container docker cp my-container:/app/error.log ./error.log docker cp my-container:/etc/nginx/nginx.conf ./nginx.conf ```

### Debug without a shell (distroless / scratch images)

```bash # Use docker cp to extract files docker cp my-container:/app/config.json ./

# Use nsenter to get a shell in the container's namespace (Linux) PID=$(docker inspect --format='{{.State.Pid}}' my-container) nsenter -t $PID -m -u -i -n -p -- /bin/sh

# Attach a debug container to the same namespace docker run -it --pid=container:my-container --net=container:my-container busybox sh

# Docker Desktop: use debug extension docker debug my-container ```

## Networking

### Inspect container networking

```bash # Show container IP address docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' my-container

# Show all network details docker inspect -f '{{json .NetworkSettings.Networks}}' my-container | jq

# List all networks docker network ls

# Inspect a network (see all connected containers) docker network inspect bridge docker network inspect my-compose-network

# Show port mappings docker port my-container ```

### Test connectivity between containers

```bash # From inside container A, reach container B docker exec container-a ping container-b docker exec container-a curl http://container-b:8080/health

# DNS resolution inside container docker exec my-container nslookup db docker exec my-container cat /etc/resolv.conf docker exec my-container cat /etc/hosts

# Test if port is reachable docker exec my-container nc -zv db 5432 docker exec my-container wget -qO- http://api:3000/health

# If curl/ping not available in container, install or use a debug container: docker run --rm --network container:my-container curlimages/curl curl -s http://localhost:8080 ```

### Common networking issues

```bash # "Connection refused" between containers # 1. Check the app binds to 0.0.0.0, not 127.0.0.1 docker exec my-container netstat -tlnp # If listening on 127.0.0.1 — fix the app config

# 2. Check containers are on the same network docker inspect -f '{{json .NetworkSettings.Networks}}' container-a | jq 'keys' docker inspect -f '{{json .NetworkSettings.Networks}}' container-b | jq 'keys'

# 3. Check published ports vs exposed ports # EXPOSE only documents, it doesn't publish # Use -p host:container to publish

# "Name not found" — DNS not resolving container names # Container names resolve only on user-defined networks, NOT the default bridge docker network create my-net docker run --network my-net --name api my-api-image docker run --network my-net --name db postgres # Now "api" and "db" resolve to each other ```

### Capture network traffic

```bash # tcpdump inside a container docker exec my-container tcpdump -i eth0 -n port 8080

# If tcpdump not available, use a sidecar docker run --rm --net=container:my-container nicolaka/netshoot tcpdump -i eth0 -n

# netshoot has: tcpdump, curl, nslookup, netstat, iperf, etc. docker run --rm --net=container:my-container nicolaka/netshoot bash ```

## Resource Usage

### Real-time stats

```bash # All containers docker stats

# Specific containers docker stats api db redis

# One-shot (no streaming) docker stats --no-stream

# Formatted output docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}" ```

### Memory investigation

```bash # Check memory limit docker inspect --format='{{.HostConfig.Memory}}' my-container # 0 means unlimited

# Check if container was OOM-killed docker inspect --format='{{.State.OOMKilled}}' my-container

# Memory usage breakdown (Linux cgroups) docker exec my-container cat /sys/fs/cgroup/memory.current 2>/dev/null || \ docker exec my-container cat /sys/fs/cgroup/memory/memory.usage_in_bytes

# Process memory inside container docker exec my-container ps aux --sort=-%mem | head -10 docker exec my-container top -bn1 ```

### Disk usage

```bash # Overall Docker disk usage docker system df docker system df -v

# Container filesystem size docker inspect --format='{{.SizeRw}}' my-container

# Find large files inside container docker exec my-container du -sh /* 2>/dev/null | sort -rh | head -10 docker exec my-container find /tmp -size +10M -type f

# Check for log file bloat docker exec my-container ls -lh /var/log/ ```

## Dockerfile Debugging

### Multi-stage build debugging

```bash # Build up to a specific stage docker build --target builder -t my-app:builder .

# Inspect what's in the builder stage docker run --rm -it my-app:builder sh docker run --rm my-app:builder ls -la /app/ docker run --rm my-app:builder cat /app/package.json

# Check which files made it to the final image docker run --rm my-image ls -laR /app/

# Build with no cache (fresh build) docker build --no-cache -t my-app .

# Build with progress output docker build --progress=plain -t my-app . ```

### Image inspection

```bash # Show image layers (size of each) docker history my-image docker history --no-trunc my-image

# Inspect image config (entrypoint, cmd, env, ports) docker inspect my-image | jq '.[0].Config | {Cmd, Entrypoint, Env, ExposedPorts, WorkingDir}'

# Compare two images docker history image-a --format "{{.Size}}\t{{.CreatedBy}}" > layers-a.txt docker history image-b --format "{{.Size}}\t{{.CreatedBy}}" > layers-b.txt diff layers-a.txt layers-b.txt

# Find what changed between builds docker diff my-container # A = added, C = changed, D = deleted ```

## Health Checks

### Define and debug health checks

```dockerfile # In Dockerfile HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \ CMD curl -f http://localhost:8080/health || exit 1 ```

```bash # Check health status docker inspect --format='{{.State.Health.Status}}' my-container # "healthy", "unhealthy", or "starting"

# See health check log (last 5 results) docker inspect --format='{{json .State.Health}}' my-container | jq

# Run health check manually docker exec my-container curl -f http://localhost:8080/health

# Override health check at run time docker run --health-cmd "curl -f http://localhost:8080/health || exit 1" \ --health-interval 10s my-image

# Disable health check docker run --no-healthcheck my-image ```

## Docker Compose Debugging

### Service startup issues

```bash # Check service status docker compose ps

# See why a service failed docker compose logs failed-service

# Start with verbose output docker compose up --build 2>&1 | tee compose.log

# Start a single service (with dependencies) docker compose up db

# Start without dependencies docker compose up --no-deps api

# Recreate containers from scratch docker compose up --force-recreate --build

# Check effective config (after variable substitution) docker compose config ```

### Service dependency and startup order

```yaml # docker-compose.yml services: api: depends_on: db: condition: service_healthy redis: condition: service_started

db: image: postgres:16 healthcheck: test: ["CMD-SHELL", "pg_isready -U postgres"] interval: 5s timeout: 5s retries: 5

redis: image: redis:7 healthcheck: test: ["CMD", "redis-cli", "ping"] interval: 5s timeout: 5s retries: 5 ```

```bash # Wait for a service to be healthy before running commands docker compose up -d db docker compose exec db pg_isready # Polls until ready docker compose up -d api ```

## Cleanup

```bash # Remove stopped containers docker container prune

# Remove unused images docker image prune

# Remove everything unused (containers, images, networks, volumes) docker system prune -a

# Remove volumes too (WARNING: deletes data) docker system prune -a --volumes

# Remove dangling build cache docker builder prune ```

## Tips

- `docker logs -f` is the first thing to check. Most container failures are visible in the logs. - Exit code 137 means OOM-killed. Increase the memory limit or fix the memory leak. - Apps inside containers must bind to `0.0.0.0`, not `127.0.0.1`. Localhost inside a container is isolated. - Container names only resolve via DNS on user-defined networks, not the default `bridge`. Always create a custom network for multi-container setups. - `docker exec` only works on running containers. For crashed containers, use `docker cp` to extract logs or override the entrypoint with `docker run --entrypoint sh`. - `nicolaka/netshoot` is the Swiss Army knife for container networking. It has every networking tool pre-installed. - `--progress=plain` during builds shows full command output, which is essential for debugging build failures. - Health checks with `start-period` prevent false unhealthy status during slow application startup.

More Products