Tools: Container Security for SREs: The Practical Checklist


Security Is Part of Reliability

The Base Image Problem

The Security Checklist

1. Image Scanning

2. Non-Root Container

3. Network Policies

4. Secrets Management

5. Resource Limits

6. Pod Security Standards

The Audit Automation

SREs think about availability, latency, and throughput. But a security breach is just another type of incident, and often the worst kind. Here's the container security checklist I use.

Smaller image = smaller attack surface. The multi-stage build removes build tools from the final image.

Every finding becomes a ticket. No exceptions.

If you want automated security monitoring for your container infrastructure, check out what we're building at Nova AI Ops.

Written by Dr. Samson Tanimawo

BSc · MSc · MBA · PhD

Founder & CEO, Nova AI Ops. https://novaaiops.com


```dockerfile
# Bad: 800MB image with everything including gcc
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y python3 python3-pip
COPY . /app
# requirements.txt was copied into /app, so reference it there
RUN pip install -r /app/requirements.txt

# Good: 50MB image with only what's needed
FROM python:3.11-slim AS builder
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.11-slim
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY . /app
USER nobody
```

```yaml
# GitHub Actions: Scan before pushing
- name: Scan image for vulnerabilities
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'myapp:${{ github.sha }}'
    format: 'table'
    exit-code: '1'            # Fail build on HIGH/CRITICAL
    severity: 'HIGH,CRITICAL'
    ignore-unfixed: true      # Only fail on fixable vulns
```

```dockerfile
# Always run as non-root
RUN addgroup --system app && adduser --system --ingroup app app
USER app
```

```yaml
# Kubernetes: Enforce non-root
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
```

```yaml
# Default deny all traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow only specific traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service-policy
spec:
  podSelector:
    matchLabels:
      app: api-service
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: nginx-ingress
      ports:
        - port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - port: 5432
```

```yaml
# Bad: Secret hardcoded as a plain environment value
env:
  - name: DB_PASSWORD
    value: "super-secret-password"  # Visible in pod spec!

# Good: Secret referenced from a Kubernetes Secret
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-credentials
        key: password

# Best: Secrets injected from Vault
annotations:
  vault.hashicorp.com/agent-inject: "true"
  vault.hashicorp.com/role: "api-service"
  vault.hashicorp.com/agent-inject-secret-db: "secret/data/db"
```

```yaml
# Without limits, a compromised container can consume all resources
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
    ephemeral-storage: 100Mi  # Prevent disk filling attacks
```

```yaml
# Enforce restricted security standard at namespace level
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

```bash
#!/bin/bash
# weekly-security-audit.sh

echo "=== Image Age Check ==="
# List every distinct image in the cluster and print its build timestamp
kubectl get pods -A -o json | jq -r '.items[] | .spec.containers[] | .image' | sort -u | while read -r img; do
  age=$(skopeo inspect "docker://$img" 2>/dev/null | jq -r '.Created')
  echo "$img built: $age"
done

echo "=== Privileged Containers ==="
# any(...) reports each pod once, even if several containers match
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | .metadata.namespace + "/" + .metadata.name'

echo "=== Containers Running as Root ==="
# runAsNonRoot can be set on the pod or on a container; the container value wins when present
kubectl get pods -A -o json | jq -r '.items[] | .spec.securityContext.runAsNonRoot as $pod | select(any(.spec.containers[]; (.securityContext.runAsNonRoot | if . == null then $pod else . end) != true)) | .metadata.namespace + "/" + .metadata.name'

echo "=== Missing Resource Limits ==="
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits == null)) | .metadata.namespace + "/" + .metadata.name'
```
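The audit checks in the script can also be expressed as a small program that returns one structured record per finding, which may make the "every finding becomes a ticket" rule easier to automate. The following is a minimal sketch in Python, assuming its input has the shape of the JSON printed by `kubectl get pods -A -o json`. The function names `audit_pods` and `effective_run_as_non_root`, the check labels, and the sample data are hypothetical and not part of any tool named in this article.

```python
def effective_run_as_non_root(pod_spec, container):
    """Resolve runAsNonRoot: the container-level value overrides the pod-level one."""
    container_value = (container.get("securityContext") or {}).get("runAsNonRoot")
    if container_value is not None:
        return container_value is True
    pod_value = (pod_spec.get("securityContext") or {}).get("runAsNonRoot")
    return pod_value is True


def audit_pods(pod_list):
    """Return one finding dict per (container, failed check)."""
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        spec = pod.get("spec", {})
        pod_name = f"{meta.get('namespace', 'default')}/{meta.get('name', '?')}"
        for container in spec.get("containers", []):
            security = container.get("securityContext") or {}
            failed = []
            if security.get("privileged") is True:
                failed.append("privileged")
            if not effective_run_as_non_root(spec, container):
                failed.append("may-run-as-root")
            if not (container.get("resources") or {}).get("limits"):
                failed.append("missing-resource-limits")
            for check in failed:
                findings.append({
                    "pod": pod_name,
                    "container": container.get("name", "?"),
                    "check": check,
                })
    return findings


# Hypothetical sample input: one privileged pod with no limits and no runAsNonRoot
sample = {
    "items": [
        {
            "metadata": {"namespace": "dev", "name": "debug"},
            "spec": {
                "containers": [
                    {"name": "shell", "securityContext": {"privileged": True}}
                ]
            },
        }
    ]
}

for finding in audit_pods(sample):
    print(f"{finding['pod']} [{finding['container']}]: {finding['check']}")
```

In practice the dictionary would come from parsing the kubectl output with `json.load`, and each returned record could be handed to whatever ticketing system is in use. Reporting per container rather than per pod is a design choice of this sketch: it gives each ticket a single, specific thing to fix.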