Nobody cares that your Kubernetes cluster is healthy (and what to measure instead)
A few weeks ago, our new principal engineer sat down with our team and said something that stung a little: “I can see your cluster is up. I have no idea if anyone finds it useful.”
That’s a hard sentence to sit with when you’ve spent months tuning alerts and building dashboards.
I manage a team of SREs. We look after EKS, ArgoCD, Loki, Backstage, Karpenter, and a handful of other tools that together form what we loosely call “the platform.” We’re good at keeping things running. We have alerts. We have runbooks. We have dashboards full of green lights.