Stop Writing Shell Scripts for Container Health Checks

Source: Dev.to

It started as one of those “this will take 30 seconds” moments.

We ship a container that includes a tiny helper binary, something we compile in a builder stage and COPY into the runtime image. Think: config-render, migrate-db, probe, whatever your service depends on at startup.

I just wanted to make sure the image really contained the helper I thought it contained, so I added:

```dockerfile
RUN config-render --version
```

Green build. Ship it. Done.

...until a pipeline failed later with an error that told me basically nothing useful.

- Was it missing from PATH?
- Did we copy it into the wrong directory?
- Did it lose the executable bit?
- Wrong architecture (hello exec format error)?
- Or did it run but print something unexpected?

So I did what we all do: I started “hardening” the check.

- First: “Is it on PATH?”
- Then: “Can it execute?”
- Then: “What version is it?”
- Then: “Is that version acceptable?”

And suddenly my “simple validation” became a little bash pipeline that tried to be clever:

```bash
command -v config-render >/dev/null 2>&1 && \
  config-render --version 2>/dev/null | \
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+' && \
  # ...some version comparison logic... \
  true || \
  (echo "config-render missing or wrong version"; exit 1)
```

It worked... for now. But this is where validation quietly becomes a maintenance trap:

- Output parsing breaks when formatting changes (“v1.2.3”, “1.2.3 (build abc)”, extra lines, etc.)
- Redirect chains swallow the exact error you needed (permission denied vs exec format error vs missing)
- Version comparisons drift into “good enough”
- Every repo grows its own flavor of the same fragile scripts

At some point I realized: I don’t want “a shell script that hopefully detects the problem.” I want a single check that tells me exactly why it failed.

That’s what Preflight is for.

## One command, clear failure reasons

Instead of assembling checks out of grep + awk, you run:

```bash
preflight cmd config-render --min 1.4.0
```

```text
[OK] cmd: config-render
  path: /usr/local/bin/config-render
  version: 1.6.2
```

When it fails, you get the reason:

- not found in PATH
- failed to execute (permission denied / exec format error / etc.)
- version too old (with an explicit comparison)

```text
[FAIL] cmd: config-render
  failed to execute: exec format error
```

```text
[FAIL] cmd: config-render
  version 1.2.0 < minimum 1.4.0
```

No guessing. No “why did this randomly break today”.

## Why predictable checks matter (containers, CI, everywhere)

This isn’t just a “containers are minimal” problem. It’s a reliability problem. Shell-based checks are notoriously sensitive to environment:

- bash vs sh differences (and whatever /bin/sh happens to be today)
- GNU vs BSD tool differences (grep, sed, awk behave just differently enough)
- Linux vs macOS quirks in CI runners
- inconsistent error messaging when commands fail inside pipelines

A small binary with a narrow job tends to behave the same way everywhere. That consistency is what you actually want from validation: predictable pass/fail and predictable output.

## Multi-stage builds bonus: executable documentation

There’s another benefit I didn’t appreciate until later: in a multi-stage Dockerfile, these checks become documentation that runs.

```dockerfile
RUN preflight cmd config-render --min 1.4.0
RUN preflight env DATABASE_URL
RUN preflight file /app/config.yaml --not-empty
```

You’re not just “testing stuff”. You’re encoding expectations:

- “This image must contain this helper”
- “This version is the minimum supported”
- “This env var must be present”
- “This config file must exist and be non-empty”

It reads like a contract - and it fails like one too.

## What Preflight checks

The above command check was just the beginning. Preflight now supports:

- cmd: exists on PATH, executes, extracts version, compares semver
- env: required env vars, allowed values / patterns
- file: existence, permissions, “not empty”, basic content checks
- http / tcp: connectivity with retry + timeout
- hash: checksum verification

## Three places it immediately helps

## 1) Build-time validation (fail fast)

```dockerfile
COPY --from=ghcr.io/vertti/preflight:latest /preflight /usr/local/bin/preflight
RUN preflight cmd config-render --min 1.4.0
RUN preflight env DATABASE_URL
RUN preflight file /app/config.yaml --not-empty
```

## 2) CI validation (same checks outside Docker)

```bash
preflight cmd config-render --min 1.4.0
preflight env DATABASE_URL
```

## 3) Healthchecks without curl (distroless-friendly)

```dockerfile
HEALTHCHECK CMD ["/preflight", "http", "http://localhost:8080/health"]
```

## When shell is fine, and when it bites you

Shell is totally fine when:

- the check is genuinely trivial
- you don’t care about consistent error reporting
- you’re happy maintaining the script

Preflight pays off when:

- you’re copying/downloading helper binaries into images
- you want the same behavior in CI and in containers
- you need real version constraints
- you want consistent output across checks
- you want checks that double as executable documentation

Repo: https://github.com/vertti/preflight
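
For anyone staying with shell, the point above about redirect chains swallowing the exact error can be made concrete. The following is a minimal sketch, not how Preflight works internally: `classify` is a hypothetical helper name, and it relies only on the standard shell convention that exit status 127 means "command not found" and 126 means "found but could not be executed". Keeping stderr and the exit status, instead of piping everything through `2>/dev/null` and `||`, is what preserves the failure reason.

```shell
#!/bin/sh
# Hypothetical helper (not part of Preflight): run "<cmd> --version" and
# report why it failed, based on the shell's own exit status.
classify() {
  # Capture stdout and stderr together so the error text is not thrown away.
  out=$("$1" --version 2>&1)
  status=$?
  case "$status" in
    0)   echo "ok: $out" ;;
    126) echo "fail: found but cannot execute" ;;  # e.g. permission denied
    127) echo "fail: not found in PATH" ;;
    *)   echo "fail: exited with status $status" ;;
  esac
  return "$status"
}
```

In a build step this could be called as `classify config-render || exit 1`. It still does no version comparison, and the wording of the underlying error differs between shells, which is the kind of drift the article warns about.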