
The Security Hole in My Monitoring Stack I Didn't Know I Had


Jun 21, 2026

I was sitting in a security audit last quarter when the consultant casually mentioned that my Prometheus node exporter had "unrestricted container execution capabilities." I laughed. It's just reading metrics. Then she showed me the RBAC role I'd written three years ago, and my stomach dropped.

Like many developers, I've been grant-and-forget with Kubernetes permissions. You need metrics? nodes/proxy. You need logs? nodes/proxy. You need health checks? You guessed it: nodes/proxy. It works, your dashboards light up, and you move on. But the graduation of fine-grained kubelet authorization to GA in Kubernetes v1.36 is forcing me to reckon with something I've been willfully ignoring: I've been handing God-mode credentials to every monitoring tool in my cluster.

The problem isn't new, but the solution finally feels mature enough to actually use.

The nodes/proxy Trap

Here's what I didn't fully appreciate until I dug into this: the nodes/proxy permission isn't just overly broad, it's a direct path to container compromise. It covers every kubelet endpoint that lacks its own subresource, and that includes exec. When I grant it to a DaemonSet, I'm not just giving it read access to metrics. I'm giving it the ability to exec into any container on that node.

The worst part? A security researcher showed in early 2026 that even "read-only" nodes/proxy access can be weaponized through a WebSocket quirk. The kubelet derives the RBAC verb from the HTTP method, and a WebSocket handshake is an HTTP GET. An exec session opened over WebSocket is therefore authorized as get rather than create, so a compromised monitoring agent holding only get on nodes/proxy can run arbitrary commands inside containers on its node, including any that run as root. It's not a theoretical attack anymore.
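To make the bypass concrete, here is a minimal sketch of the verb mapping described above. It is a simplified Python model, not the kubelet's actual code: the function names, the example path, and the set-of-tuples permission format are all assumptions made for illustration.

```python
# Simplified model: the RBAC verb is derived from the HTTP method alone.
HTTP_TO_RBAC_VERB = {
    "GET": "get",
    "HEAD": "get",
    "POST": "create",
    "PUT": "update",
    "PATCH": "patch",
    "DELETE": "delete",
}


def required_permission(method: str, path: str) -> tuple[str, str]:
    """Return the (verb, resource) pair this model checks for a request."""
    verb = HTTP_TO_RBAC_VERB[method.upper()]
    # Assumption: exec has no dedicated subresource, so it lands on nodes/proxy
    # whatever the path is. The path is accepted only to mirror a real request.
    return verb, "nodes/proxy"


def is_allowed(granted: set[tuple[str, str]], method: str, path: str) -> bool:
    """True if the granted permissions cover the request."""
    return required_permission(method, path) in granted


# A "read-only" monitoring agent: get on nodes/proxy and nothing else.
read_only_agent = {("get", "nodes/proxy")}

# Hypothetical exec path, used only for illustration.
exec_path = "/exec/default/my-pod/my-container"

# A POST-based exec needs create, which the agent lacks.
post_exec_allowed = is_allowed(read_only_agent, "POST", exec_path)
# A WebSocket handshake is a GET, so the same exec passes as a plain read.
websocket_exec_allowed = is_allowed(read_only_agent, "GET", exec_path)
```

In this model post_exec_allowed is False and websocket_exec_allowed is True. The permission never changed, only the HTTP method used to reach the same endpoint.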

What Fine-Grained Authorization Actually Changes

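Kubernetes v1.36 stops funneling most kubelet API paths through nodes/proxy. Metrics, stats, and logs already had their own subresources (nodes/metrics, nodes/stats, nodes/log). What's new is that other common operations like checking health (/healthz), listing pods (/pods), and reading the kubelet's configuration (/configz) now get theirs too: nodes/healthz, nodes/pods, and nodes/configz. This means I can finally write RBAC rules that say "you can read metrics and health, period" instead of "you can do whatever the kubelet allows."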

The implementation is clever: the kubelet tries the specific subresource first, then falls back to nodes/proxy for backward compatibility. This matters because it means I don't have to refactor everything overnight. Old workloads keep working while new ones can be locked down from day one.
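The try-specific-then-fall-back behavior can be sketched roughly like this. It is a simplified model under my own assumptions, not the kubelet's implementation: the path tables reflect my reading of the Kubernetes documentation, verbs are left out for brevity, and the helper names are hypothetical.

```python
# Paths with a long-standing dedicated subresource (no proxy fallback here).
DEDICATED = {
    "/metrics": "nodes/metrics",
    "/stats": "nodes/stats",
    "/logs": "nodes/log",
}

# Paths that gain a fine-grained subresource but still fall back to nodes/proxy.
FINE_GRAINED = {
    "/healthz": "nodes/healthz",
    "/pods": "nodes/pods",
    "/runningPods": "nodes/pods",
    "/configz": "nodes/configz",
}


def _matches(path: str, prefix: str) -> bool:
    """True if path is the prefix itself or a sub-path of it."""
    return path == prefix or path.startswith(prefix + "/")


def candidate_resources(path: str) -> list[str]:
    """Resources to check for a path, in the order they are tried."""
    for prefix, resource in DEDICATED.items():
        if _matches(path, prefix):
            return [resource]
    for prefix, resource in FINE_GRAINED.items():
        if _matches(path, prefix):
            return [resource, "nodes/proxy"]
    # Everything else, including exec, is covered only by nodes/proxy.
    return ["nodes/proxy"]


def authorize(granted: set[str], path: str) -> bool:
    """Allow the request if any candidate resource has been granted."""
    return any(resource in granted for resource in candidate_resources(path))
```

With this model, a legacy agent holding only nodes/proxy can still reach /pods through the fallback. A new agent holding only nodes/healthz can reach /healthz but is refused on an exec path.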

The built-in system:kubelet-api-admin role gets updated automatically, so cluster admins don't wake up to broken API servers. That's thoughtful design.

My Take: A Security Feature That Actually Feels Practical

I've been burned before by Kubernetes features that solve problems I don't have in a way I don't want them solved. This isn't one of those. Fine-grained kubelet auth solves a real problem I've genuinely created through lazy RBAC practices, and it does it without forcing a flag-day migration.

What I like: the backwards compatibility approach means I can adopt this incrementally. What I'm slightly skeptical about: will teams actually take advantage of this, or will everyone just continue using nodes/proxy because it's easier? The burden is now on us to be better.

The Monitoring Agent I'm Rewriting

Here's what I'm moving to in my next deployment:

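apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-node-exporter
rules:
# Read-only access to the kubelet's metrics and stats endpoints; no nodes/proxy.
- apiGroups: [""]
  resources: ["nodes/metrics", "nodes/stats"]
  verbs: ["get"]
# Health checks go through the fine-grained nodes/healthz subresource.
- apiGroups: [""]
  resources: ["nodes/healthz"]
  verbs: ["get"]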

Previously, I would've written one rule with nodes/proxy and called it done. Now I'm explicit about what the exporter needs. If Prometheus gets compromised, the attacker gets metrics, not a root shell in every container.

It's the principle of least privilege actually working instead of being a security theater buzzword on a slide deck.
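To find the roles that still need this treatment, something like the following check could help. It is a rough sketch that works on an already-parsed ClusterRole dictionary, so it needs no YAML library. The function name find_proxy_grants and the legacy-monitoring-agent role are hypothetical.

```python
def find_proxy_grants(cluster_role: dict) -> list[int]:
    """Return indexes of rules that grant nodes/proxy in the core API group.

    Also flags the RBAC wildcard forms "*" and "*/proxy", since both
    cover nodes/proxy.
    """
    flagged = []
    for index, rule in enumerate(cluster_role.get("rules") or []):
        groups = rule.get("apiGroups") or []
        resources = rule.get("resources") or []
        in_core_group = "" in groups or "*" in groups
        covers_proxy = any(r in ("nodes/proxy", "*/proxy", "*") for r in resources)
        if in_core_group and covers_proxy:
            flagged.append(index)
    return flagged


# Hypothetical legacy role of the grant-and-forget kind.
legacy_role = {
    "kind": "ClusterRole",
    "metadata": {"name": "legacy-monitoring-agent"},
    "rules": [
        {"apiGroups": [""], "resources": ["nodes/proxy"], "verbs": ["get"]},
    ],
}

# The least-privilege role from this post, as a parsed dictionary.
scoped_role = {
    "kind": "ClusterRole",
    "metadata": {"name": "prometheus-node-exporter"},
    "rules": [
        {"apiGroups": [""], "resources": ["nodes/metrics", "nodes/stats"], "verbs": ["get"]},
        {"apiGroups": [""], "resources": ["nodes/healthz"], "verbs": ["get"]},
    ],
}

legacy_hits = find_proxy_grants(legacy_role)
scoped_hits = find_proxy_grants(scoped_role)
```

Here legacy_hits contains the index of the offending rule and scoped_hits is empty. In practice the dictionaries would come from whatever tooling already exports the cluster's roles.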

Where I'm Still Uncertain

One question I haven't seen clearly answered: how does this interact with audit logging? If a request hits both the fine-grained check and the fallback, does it show up twice in audit logs? In a high-traffic cluster, that could get noisy fast.

Also, I want to see real-world examples from maintainers running critical infrastructure. Does this work cleanly at scale, or are there edge cases where the fallback behavior creates unexpected access patterns?

The feature is GA now, which means it's production-ready. But I'm going to roll it out cautiously in non-critical clusters first. Not because I distrust the Kubernetes team, but because changing how kubelet authorization works is security-critical enough that I want to see it behave in my own environment first.

What's your current approach to kubelet permissions? Are you still using the broad nodes/proxy, or have you been experimenting with finer-grained controls?


Source: This post was inspired by "Kubernetes v1.36: Fine-Grained Kubelet API Authorization Graduates to GA" by Kubernetes Blog.


Written by Adil Sher

Full stack developer building high-traffic platforms, AI services, and custom web applications.
