The Kubernetes Security Policy I Can't Accidentally Delete (And Why That Matters)

Last month, I was helping a client recover from what should have been a routine cluster upgrade. Their platform team had defined ValidatingAdmissionPolicies to enforce approved image registries and block privileged containers. Good security posture. But somewhere in the chaos of a botched etcd restore, someone (I won't say who) accidentally deleted the core policy resources. For about forty minutes, none of those policies were being enforced. No one noticed until the logs showed a pod pulling from an untrusted registry.

That's when I realized we've been solving the wrong problem. We've spent years building more sophisticated admission control mechanisms, but we never really solved the fundamental issue: how do you make policies that can't be accidentally (or maliciously) removed? Kubernetes v1.36's manifest-based admission control is a direct answer to this problem, and honestly, it's the kind of feature that makes you wonder why it took this long to arrive.

The Bootstrap Problem That's Been Nagging Us

Here's what happens in Kubernetes cluster operations today. Your API server starts, it begins accepting requests, and then, if everything goes right, your ValidatingAdmissionPolicies and webhook configurations are loaded from etcd and take effect. If anything goes wrong, you have a window where the cluster is serving requests with none of those policies enforced.

This isn't theoretical. I've seen it happen during disaster recovery scenarios, fresh cluster provisioning, and even during routine restarts. That window is usually only seconds long, but on a busy production cluster, seconds are enough for a workload that should have been rejected to be admitted. More importantly, there's no way to close that window entirely with API-based policies.

The deeper issue is self-referential. An admission webhook can't protect itself from deletion because that would create a circular dependency—the webhook would need to intercept changes to its own configuration, which would require it to already be running. Kubernetes explicitly forbids webhooks from being invoked on admission configuration resources for exactly this reason.

How Manifest-Based Admission Control Actually Works

The concept is surprisingly straightforward. Instead of creating admission policies as API objects, you define them as YAML files on disk and point the API server to a directory. The API server loads these files during startup, before it accepts any requests. Your policies are live from second zero.

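# AdmissionConfiguration file, passed to kube-apiserver via --admission-control-config-file
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: ValidatingAdmissionPolicy
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ValidatingAdmissionPolicyConfiguration
    # Directory of policy manifests the API server loads before it serves requests
    staticManifestsDir: "/etc/kubernetes/admission/policies/"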

One constraint I appreciate: all manifest-defined resources must have names ending in .static.k8s.io. This isn't just namespacing—it's an audit trail. When you're investigating why a request was denied, you immediately know whether it came from a static policy or an API-based one. That's good operational thinking.
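
To make the naming rule concrete, here's roughly what I'd expect a file in that directory to look like. Treat it as a sketch, not the canonical format: I'm assuming static manifests use the same admissionregistration.k8s.io/v1 schema as API-based policies and that a policy and its binding can share one multi-document file. The registry.example.com prefix and both resource names are placeholders I made up.

```yaml
# Hypothetical file: /etc/kubernetes/admission/policies/approved-registry.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  # Name carries the .static.k8s.io suffix required for manifest-defined resources
  name: approved-registry.static.k8s.io
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  # Only checks spec.containers; init and ephemeral containers would need their own rules
  - expression: >-
      object.spec.containers.all(c,
        c.image.startsWith('registry.example.com/'))
    message: "images must come from registry.example.com (placeholder registry)"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: approved-registry-binding.static.k8s.io
spec:
  policyName: approved-registry.static.k8s.io
  # Deny rejects the request outright; Warn and Audit are the softer options
  validationActions: ["Deny"]
```

If the loader turns out not to accept multi-document files, the same two objects split into separate files should work just as well.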

The real innovation is that static policies can intercept operations on other admission resources. There's no circular dependency, because a static policy is never changed through the API: if one misbehaves, you fix the file on disk. That means you can write a static policy that prevents deletion of your critical API-based policies. You're essentially creating a baseline that can't be removed through the API, even by a cluster admin.
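
Here's a sketch of what such a guard policy might look like. The example.com/protected label is a convention I invented for illustration, and the resource names are placeholders. The one detail worth noticing is that on DELETE requests the object being removed is exposed as oldObject, not object.

```yaml
# Hypothetical static manifest: blocks deletion of API-based policies that carry a marker label
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: protect-core-policies.static.k8s.io
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["admissionregistration.k8s.io"]
      apiVersions: ["v1"]
      operations: ["DELETE"]
      resources:
      - validatingadmissionpolicies
      - validatingadmissionpolicybindings
  validations:
  # Allow the delete only when the existing object does NOT carry the (made-up) label
  - expression: >-
      !has(oldObject.metadata.labels) ||
      !('example.com/protected' in oldObject.metadata.labels)
    message: "this admission resource is protected by a static policy and cannot be deleted"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: protect-core-policies-binding.static.k8s.io
spec:
  policyName: protect-core-policies.static.k8s.io
  validationActions: ["Deny"]
```

On its own this only covers DELETE. Someone could still strip the label with an UPDATE and then delete, so a real version would want a second rule guarding label changes too.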

My Take: It's Useful But Raises Questions

I like this feature. The problem it solves is real, and I've felt that pain in production. But I have some reservations.

First, this shifts operational complexity to the filesystem. Now cluster operators need to manage files on disk across multiple API server instances, keep them in sync, version control them, and handle restarts. That's different from managing API objects. It's not harder, necessarily, but it's different. You'll need updated deployment tooling and runbook procedures. My team's spent years optimizing our Kubernetes deployment pipeline around API objects—this requires rethinking part of that.
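
As one example of that new plumbing: on a kubeadm-style control plane, where the API server runs as a static pod, the config file and policy directory have to be mounted into the container on every control plane node. A fragment of what that might look like is below. The config.yaml filename and the volume name are my own placeholders, and I've left out any feature gate the release may require.

```yaml
# Fragment of /etc/kubernetes/manifests/kube-apiserver.yaml (kubeadm-style layout assumed)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # Points at the AdmissionConfiguration file; config.yaml is a placeholder name
    - --admission-control-config-file=/etc/kubernetes/admission/config.yaml
    volumeMounts:
    - name: admission
      mountPath: /etc/kubernetes/admission
      readOnly: true
  volumes:
  - name: admission
    hostPath:
      # Must exist, with identical contents, on every control plane node
      path: /etc/kubernetes/admission
      type: Directory
```

Keeping that host directory identical across nodes is exactly the sync problem I mean. Whatever already manages your control plane hosts (configuration management, image builds) is probably where these files should live.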

Second, I'm concerned about the debugging experience. When a request gets denied by a static policy, your typical kubectl describe workflows don't help. You're debugging YAML files on disk, not API resources. For platform teams, this is fine—they own the cluster and its filesystem. For multi-tenant scenarios, I'd want stronger audit logging and metric distinctions between static and API-based denials.

Third, this creates a two-tier system. Some policies are protected, some aren't. That's intentional, but it means you need clear governance about what lives in static manifests and what lives in the API. I'd want to see strong conventions around this.

The Real Value Is Operational Certainty

Where this really shines is platform teams running shared clusters. You can now guarantee that baseline security policies—deny privileged containers, enforce image registries, prevent certain configurations—genuinely can't be accidentally removed. Not by a script, not by a tired operator, not by a buggy automation job. That's worth something.

For my work, this means our disaster recovery procedures just got simpler. We can restore clusters knowing that core policies will be active from startup, no gap, no manual remediation.

What Would You Do?

If you're running Kubernetes in production, I'm curious: would you move your core policies to static manifests? Or do you prefer the flexibility of managing everything through the API? Let me know what your team thinks—the operational patterns here are still being established.

Source: This post was inspired by "Kubernetes v1.36: Admission Policies That Can't Be Deleted" on the Kubernetes Blog.
