- docker update: Resource limits can also be changed on a running container, without recreating it.

```shell
# Update the memory limit of a running container to 1GB
docker update --memory "1g" my-problematic-container

# Update the CPU limit of a running container to 0.75 cores
docker update --cpus "0.75" my-problematic-container
```
- System-Wide Slowdown: The entire server becomes unresponsive, commands execute slowly, and network connections lag.
- Application Errors or Delays: Applications I'm running (e.g., operator screens in a production ERP) respond slower than expected or generate timeout errors.
- OOM-Killed Processes: Seeing Out of Memory (OOM) killer messages in journald logs or dmesg output, which are usually triggered by memory exhaustion.
- High Disk I/O: Disk activity significantly exceeds normal levels, and a noticeable increase is observed in iostat output. This is particularly common in containers that write logs or perform intensive database operations.
- Increased Error Rates: The error rates of a service behind an Nginx reverse proxy rise because the backend container cannot respond to incoming requests in time.
- top or htop: These tools show the real-time status of CPU, memory, and running processes. They are excellent for quickly seeing which processes are consuming the most resources. htop is more interactive and easier to read thanks to its color-coded interface.

```shell
# top command
top

# htop command (you might need to install it: sudo apt install htop)
htop
```

In the top output, I pay attention to the %CPU and %MEM columns. Abnormally high values point to a potential source of problems.
- free -h: Displays memory usage in a human-readable format. The total, used, free, buff/cache, and available columns are important.

```shell
free -h
```

A drop in the available value to critical levels is a sign that the system will soon start using swap or that the OOM Killer will intervene.
- iostat -xz 1: Shows disk I/O activity. I particularly look at the await, %util, r/s, and w/s values. A high %util value indicates that the disk is excessively busy.

```shell
iostat -xz 1
```

Unusually high r/s (read requests per second) and w/s (write requests per second) values can indicate an application that is hammering the disk. I once identified a logging container completely saturating disk I/O with this command.
- vmstat 1: I use this to monitor virtual memory statistics and overall system activity. The r (running processes), b (blocked processes), swpd (used swap), free (free memory), si (swap in), and so (swap out) columns are important.

```shell
vmstat 1
```

Constantly high si and so values indicate that the system is exhausting its memory and swapping heavily to disk, which severely degrades performance.

- docker stats: Displays real-time CPU, memory, network I/O, and disk I/O usage for a specific container or for all containers. This command is the fastest way to pinpoint a resource hog.

```shell
docker stats
```

In the output, I focus on the CPU %, MEM %, MEM USAGE / LIMIT, and IO columns. Seeing a container's MEM USAGE approach or exceed its LIMIT tells me I need to intervene immediately.
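For scripting, docker stats also accepts --no-stream (a single snapshot) and --format (Go templates). Below is a minimal sketch of flagging containers over a memory threshold; the container names and percentages are made-up sample data, parsed locally so the filtering logic runs without a Docker daemon:

```shell
# In a live check you would pipe in real data instead, e.g.:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'
# Here a captured sample is parsed so the snippet is self-contained.
cat <<'EOF' | awk -v limit=80 '{ gsub(/%/, "", $2); if ($2 + 0 > limit) print $1, "is over", limit "% memory" }'
web-frontend 42.10%
worker-queue 91.55%
postgres-db 63.20%
EOF
```

With the sample above, only worker-queue is reported. Hooking such a one-liner into cron or an alerting script is a cheap early-warning system.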
- docker inspect <container_id_or_name>: Shows detailed configuration information for a container, in particular the cgroup settings under HostConfig. This is important for understanding which limits are defined for the container.

```shell
docker inspect my_problematic_container | grep -i "memory\|cpu"
```

With this command I can see settings such as Memory, CpuShares, and CpuQuota defined for the container. If none of these are set, the container can potentially consume resources without limit.
- journalctl -u docker.service: I examine the Docker daemon's own logs. Here I can find messages about the OOM Killer terminating containers.

```shell
journalctl -u docker.service --since "1 hour ago" | grep -i "oom"
```

Sometimes these logs reveal OOM errors during a build, or a container that keeps restarting.
- OOM Events in Kernel Logs: Examining the kernel logs directly can also be useful.

```shell
grep -i "oom" /var/log/kern.log
# or
dmesg | grep -i "oom"
```

These logs show system-wide memory problems more clearly, including which processes the OOM Killer targeted.

- --memory (or -m): Specifies the maximum amount of memory the container can use. This is a hard limit: if the container exceeds it, the OOM Killer terminates it. (Whether a given container was OOM-killed can also be checked with docker inspect -f '{{.State.OOMKilled}}' <container>.)

```shell
docker run -d --name my-app-limited --memory "512m" my-image
```

This command allows the my-app-limited container to use at most 512 MB of RAM.
- --memory-swap: Used together with --memory. It sets the total memory (RAM + swap) the container can use. If --memory-swap is greater than --memory, the container can use swap equal to the difference. If the two are equal, the container cannot use any swap. A value of -1 means unlimited swap.

```shell
# 512MB RAM, 512MB swap (1GB total)
docker run -d --name my-app-swap --memory "512m" --memory-swap "1g" my-image

# 512MB RAM, no swap usage
docker run -d --name my-app-no-swap --memory "512m" --memory-swap "512m" my-image
```

I once saw a Node.js application completely fill the system's swap space because of a memory leak. Setting the --memory-swap limit carefully prevents such situations.
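The relationship between the two flags is simple arithmetic; a quick sketch using the (hypothetical) values from the example above:

```shell
# swap available to the container = --memory-swap minus --memory
mem=512       # --memory, in MB
mem_swap=1024 # --memory-swap, in MB
echo "swap available to container: $((mem_swap - mem)) MB"
```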
- --memory-swappiness: Controls the Linux kernel's swappiness setting at the container level (0 to 100). Lower values reduce swap usage; higher values increase it.

```shell
docker run -d --name my-app-swappiness --memory "512m" --memory-swappiness 10 my-image
```
- --memory-reservation: This is a soft limit (memory.soft_limit_in_bytes in cgroup v1). The container may exceed this value while there is no memory pressure on the system, but it is pushed back down toward the reservation when the system needs memory.

```shell
docker run -d --name my-app-soft-limit --memory "1g" --memory-reservation "512m" my-image
```

This setting is very useful for absorbing sudden memory spikes, for example when sizing connection pools for applications like PostgreSQL.

- --cpus: Directly specifies the number of CPU cores the container can use; for example, 1.5 means one and a half cores.

```shell
docker run -d --name my-cpu-app --cpus "0.5" my-image
```

This means the container can use at most half of one CPU core.
- --cpu-shares: A relative weight for the CPU scheduler (default 1024). Higher values give the container more CPU time under contention. This is a ratio, not an absolute limit.

```shell
# If one container runs with 1024 shares and another with 512,
# the first gets twice the CPU of the second under contention.
docker run -d --name my-cpu-share-high --cpu-shares 1024 my-image
docker run -d --name my-cpu-share-low --cpu-shares 512 my-image
```
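Under full contention, each container's slice is its weight divided by the sum of all competing weights; a self-contained sketch with the weights from the example:

```shell
# Relative CPU slices implied by --cpu-shares under full contention
high=1024
low=512
total=$((high + low))
echo "high: $((100 * high / total))%  low: $((100 * low / total))%"
```

With only these two containers competing, the 1024-share container gets about two thirds of the CPU and the 512-share container about one third. When the CPU is idle, shares impose no limit at all.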
- --cpu-period and --cpu-quota: These two are used together to cap CPU usage as a percentage. cpu-period (default 100000 microseconds) defines a scheduling period, and cpu-quota defines how much CPU time the container may consume within each period.

```shell
# The container can use 50ms of CPU in every 100ms period (50% of one core)
docker run -d --name my-cpu-quota --cpu-period 100000 --cpu-quota 50000 my-image
```

This method provides an absolute limit, similar to --cpus.
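The effective cap is simply quota divided by period; a sketch with the example's values:

```shell
# Effective CPU cap as a percentage of one core
period=100000 # --cpu-period, in microseconds
quota=50000   # --cpu-quota, in microseconds
echo "cpu cap: $((100 * quota / period))% of one core"
```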
- --cpuset-cpus: Pins the container to specific CPU cores. This is especially useful for applications that need CPU cache locality or must run on specific hardware.

```shell
# Container should run only on CPUs 0 and 1
docker run -d --name my-cpuset-app --cpuset-cpus "0,1" my-image
```

At one point I used this setting to pin certain real-time workloads to specific cores.

- --blkio-weight: Sets the container's relative weight for block I/O (between 10 and 1000; 0, the default, leaves it unset). A higher weight means more I/O time.

```shell
docker run -d --name my-io-app --blkio-weight 400 my-image
```
- --device-read-bps / --device-write-bps: Limit the read/write rate for a specific device in bytes per second.

```shell
# Limit read speed from /dev/sda to 1MB/s
docker run -d --name my-read-limit --device-read-bps /dev/sda:1mb my-image

# Limit write speed to /dev/sda to 500KB/s
docker run -d --name my-write-limit --device-write-bps /dev/sda:500kb my-image
```

I used these limits on my own VPS to stop a backup-script container from hammering the disk; without them, my other services slowed down waiting for disk I/O.

- Verification with docker stats: After applying limits, I run docker stats again to check that the MEM USAGE / LIMIT and CPU % values stay within the expected boundaries.

```shell
docker stats my_problematic_container
```
- Manual Inspection of the cgroup File System: Sometimes Docker's interface is not enough, so I inspect the cgroup file system directly to confirm how the limits are applied at the kernel level.

```shell
# Find the exact cgroup path of the container
# (cgroup v1 paths; under cgroup v2 the equivalent files are memory.max and cpu.max)
CONTAINER_ID=$(docker inspect -f '{{.Id}}' my_problematic_container)
echo "/sys/fs/cgroup/memory/docker/$CONTAINER_ID"

# Check the memory limit
cat /sys/fs/cgroup/memory/docker/$CONTAINER_ID/memory.limit_in_bytes

# Check the CPU quota and period values
cat /sys/fs/cgroup/cpu/docker/$CONTAINER_ID/cpu.cfs_quota_us
cat /sys/fs/cgroup/cpu/docker/$CONTAINER_ID/cpu.cfs_period_us
```
- Following Logs: Application logs and journald logs show how the application behaves under the limits. Seeing OOM Killer messages decrease or disappear entirely is a sign that I'm on the right track.
- Fine-Tuning: It's difficult to set perfect limits in one go. I usually adjust them gradually by observing the application's behavior under normal and heavy load. For example, when deploying a new AI model in a production ERP, I closely monitored the model's memory and CPU consumption and tightened the limits after a while. Trial and error and continuous observation are key here.
- Performance Degradation due to Incorrect Limits: If I allocate too few resources to a container, the application constantly throttles, slows down, or crashes, which directly impacts user experience. For example, when sizing connection pools for PostgreSQL, setting the soft memory limit too low visibly reduced database performance.
- Hard Limit vs. Soft Limit Choices: A hard limit (--memory, --cpus) provides a guaranteed upper bound but restricts the application's flexibility during sudden needs. A soft limit (--memory-reservation, --cpu-shares), on the other hand, offers flexibility but can degrade application performance when there's memory pressure on the system. It's crucial to find the right balance based on the application's criticality and behavior.
- Unexpected Effects of the OOM Killer: When a hard limit is exceeded, the OOM Killer intervenes and terminates the container, so the application can stop suddenly and unexpectedly. Setting up monitoring and alerting for critical applications is therefore essential.
- Cost of Allocating Excessive Resources: Especially on cloud-based VPSs, allocating more resources than necessary directly increases costs. The primary purpose of a VPS is to use resources efficiently. Therefore, allocating only what each container needs is critical for both cost and overall system efficiency. In my own side product, I rigorously track these limits to optimize the VPS cost.
- Importance of the cgroup Soft Limit: The soft limit (set with --memory-reservation in Docker; memory.soft_limit_in_bytes in cgroup v1) is very useful. Under memory pressure, the kernel reclaims the container's page cache to push it back below this value, acting as a "warning" mechanism before the OOM Killer's harsh intervention. This lets the application shed memory proactively and helps the system run more stably.
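If you manage containers through Compose rather than raw docker run, the same limits can live in the Compose file. A minimal sketch; the service name and values are placeholders, and recent versions of docker compose apply deploy.resources limits outside Swarm as well:

```yaml
services:
  web:                  # hypothetical service name
    image: my-image     # image from the examples above
    deploy:
      resources:
        limits:         # hard caps, analogous to --cpus / --memory
          cpus: "0.50"
          memory: 512M
        reservations:   # soft floor, analogous to --memory-reservation
          memory: 256M
```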