A Readiness Probe That Can't Fail Is Just Wallpaper

Writing up a production outage report today, I was making the case that workloads should fail their readiness probes when they can’t reach their dependencies — databases, caches, anything required to do useful work. The collaborating Claude session put it better than I had:

A probe that doesn’t fail when its workload can’t reach its database isn’t a probe — it’s wallpaper.

That’s the whole thing. A readiness probe answers one question: is this pod ready to serve traffic? If the answer depends on a database connection and you’re not checking for that, you’re not answering the question — you’re decorating the pod spec.

Why the shallow probe fails you during an outage

When a required dependency goes down, Kubernetes keeps routing traffic to your pods if their probes stay green. Every request hits a pod that can’t do the work and the user sees the error. You’re now relying on application-layer retries or circuit breakers to save you — neither of which you should need if the probe was doing its job.

A probe that reflects real dependency health lets Kubernetes do what it’s designed for: stop routing traffic to pods that aren’t ready. When the dependency recovers, the probe passes again, traffic resumes, no manual intervention.

What the difference looks like

Shallow probe — passes as long as the HTTP server is up:

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080

Probe that actually answers the question — /ready checks the database connection:

readinessProbe:
  httpGet:
    path: /ready  # verifies DB connection, returns non-2xx if unreachable
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3

The specific timing values matter less than the endpoint — tune those to your startup time. The /ready handler should try to verify every dependency the pod needs to serve a request. Any of them unreachable, return a non-2xx. That’s it.

The wallpaper version looks like a probe, passes CI, and appears in every health check dashboard. It just doesn’t help when things go wrong — which is the only time probes matter.

Related