The SELinux Breaking Change Nobody's Talking About (Until It Breaks Your Pods)
Admin User
Author
I spent three hours last week debugging why a privileged container couldn't write to a volume that an unprivileged container had created. The volume existed, the permissions looked fine, but everything just... hung. It wasn't until I dug into the SELinux contexts that things clicked. That afternoon taught me something: most of us running Kubernetes on production Linux systems with SELinux enabled are living on borrowed time. And if you're in that camp, Kubernetes v1.37 is coming to collect.
The original issue wasn't even on my radar. I thought SELinux was one of those "turn it off in production" problems that most teams ignore. But after reading about what's coming in Kubernetes, I realized I've been wrong—and so have a lot of us building on Red Hat-based infrastructure.
What's Actually Changing Here
For years, Kubernetes handled SELinux labels reactively. When you launched a Pod with an SELinux context, the container runtime would recursively walk through every file on your volumes and reapply the correct labels. Simple, but expensive. On a volume with millions of files, or one mounted over NFS, this could take minutes.
The Kubernetes team realized they could leverage kernel-level mount options instead. Modern Linux kernels support passing SELinux contexts directly at mount time—meaning the OS handles labeling automatically without recursive traversal. Faster, cleaner, more efficient.
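To make the mount-time idea concrete, here is a minimal sketch of the kind of option string involved. The `context=` mount option is a standard Linux one; the helper name and the default user, role, and type values are my assumptions (they are typical for container files on Red Hat-based systems), not something taken from the Kubernetes source.

```python
def selinux_mount_option(level: str,
                         user: str = "system_u",
                         role: str = "object_r",
                         type_: str = "container_file_t") -> str:
    """Build a context= mount option for a full SELinux label.

    The defaults are assumed values for illustration. The label is quoted
    because MCS levels such as "s0:c100,c200" contain commas, which would
    otherwise be read as separators between mount options.
    """
    label = f"{user}:{role}:{type_}:{level}"
    return f'context="{label}"'


# Every file on the mounted volume appears with this one label,
# with no per-file relabeling pass.
print(selinux_mount_option("s0:c100,c200"))
```

The point of the sketch is that the label is a property of the mount, not of individual files, which is where the performance win comes from.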
This landed in v1.28 for ReadWriteOncePod volumes under a feature gate. Now it's graduating to general availability and expanding to cover all volumes. The problem? The new behavior breaks something that used to work silently.
The Breaking Change You Need to Know About
Here's where it gets real: two Pods with different SELinux labels can no longer share the same volume when you enable the new SELinuxMount feature gate.
Previously, this worked because each Pod's labels were applied recursively when its containers started, so nothing stopped a second Pod with a different context from using the volume. But with mount-time labeling? The volume gets mounted on a node with a specific context. Only one context. If a second Pod on that node asks for a different one, it fails to start.
The same applies to sharing volumes between privileged and unprivileged Pods. That used to work. It won't anymore.
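A toy model helps me reason about this. The class below is purely illustrative (the names are mine and it mirrors nothing in the kubelet's actual code): it treats a volume on one node as a single mount that remembers the first context it was given and refuses any other.

```python
class NodeVolumeMount:
    """Toy model: one volume, mounted once on a node, with one SELinux context."""

    def __init__(self) -> None:
        self.context = None  # set by the first Pod that uses the volume
        self.pods = []

    def attach(self, pod_name: str, context: str) -> None:
        if self.context is None:
            # The first Pod decides the context for the whole mount.
            self.context = context
        elif context != self.context:
            # Later Pods must match; the mount cannot carry two contexts.
            raise ValueError(
                f"{pod_name} wants {context!r}, "
                f"but the volume is mounted with {self.context!r}"
            )
        self.pods.append(pod_name)


mount = NodeVolumeMount()
mount.attach("app-unprivileged", "s0:c100,c200")
try:
    mount.attach("app-privileged", "s0:c300,c400")
except ValueError as err:
    print(f"second Pod rejected: {err}")
```

Under recursive relabeling there was no equivalent of the `elif` branch, which is why mismatched Pods used to start without complaint.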
Kubernetes is giving us v1.36 to audit our clusters and fix this before v1.37, which is expected to make SELinuxMount the default behavior. That's your window.
What I'd Actually Do Right Now
If I had SELinux enforcing on my clusters, here's my action plan:
First, inventory your workloads. Which Pods share volumes? Which ones have different SELinux contexts? I'd write a quick audit script against the Kubernetes API—nothing fancy, just check PersistentVolumeClaims and cross-reference Pods consuming them.
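The inventory step could be sketched roughly like this. It assumes you feed it the parsed output of `kubectl get pods --all-namespaces -o json`; the function names and the `"privileged"` and `"<unset>"` bucket labels are my own conventions. It is deliberately conservative: it ignores which node each Pod runs on, so it will flag some sharing that would never actually collide.

```python
from collections import defaultdict

LABEL_FIELDS = ("user", "role", "type", "level")


def pod_labels(pod: dict) -> set:
    """Return the distinct SELinux labels used by a Pod's containers."""
    spec = pod.get("spec") or {}
    pod_opts = (spec.get("securityContext") or {}).get("seLinuxOptions") or {}
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    labels = set()
    for container in containers:
        ctx = container.get("securityContext") or {}
        if ctx.get("privileged"):
            labels.add("privileged")  # bucket privileged containers together
            continue
        # Container-level options override Pod-level ones field by field.
        opts = {**pod_opts, **(ctx.get("seLinuxOptions") or {})}
        label = ":".join(opts.get(field, "") for field in LABEL_FIELDS)
        labels.add(label if label != ":::" else "<unset>")
    return labels


def find_conflicts(pod_list: dict) -> dict:
    """Map (namespace, claimName) -> {label: [pod names]} for every
    PersistentVolumeClaim used with more than one distinct label."""
    claims = defaultdict(lambda: defaultdict(set))
    for pod in pod_list.get("items") or []:
        meta = pod.get("metadata") or {}
        namespace = meta.get("namespace", "default")
        name = meta.get("name", "?")
        labels = pod_labels(pod)
        for volume in (pod.get("spec") or {}).get("volumes") or []:
            pvc = volume.get("persistentVolumeClaim")
            if not pvc:
                continue  # only PVC-backed volumes are audited here
            for label in labels:
                claims[(namespace, pvc["claimName"])][label].add(name)
    return {
        claim: {label: sorted(pods) for label, pods in by_label.items()}
        for claim, by_label in claims.items()
        if len(by_label) > 1
    }
```

To use it, save the kubectl output to a file, load it with `json.load`, and pass the result to `find_conflicts`. Anything it returns is a claim worth a closer look before you flip the feature gate.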
Second, test the new behavior immediately. Don't wait for v1.37. Enable the SELinuxMount feature gate in a dev cluster and see what breaks. Better to find these issues on your terms than in production during an upgrade.
Third, redesign your volume sharing strategy. One option is refactoring to use subPaths, mounting different subdirectories of the same volume into different Pods. Test this with the feature gate enabled before relying on it, because a mount-time context applies to the whole volume mount rather than to each subdirectory. The more robust alternative is to separate your privileged and unprivileged workloads onto different volumes entirely.
Here's what a safer approach might look like:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-unprivileged
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c100,c200"
  containers:
    - name: app
      image: registry.example.com/app:latest  # placeholder image
      volumeMounts:
        - name: shared-data
          mountPath: /data/app
          subPath: unprivileged  # This Pod only sees its own subdirectory
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-pvc
---
apiVersion: v1
kind: Pod
metadata:
  name: app-privileged
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c300,c400"
  containers:
    - name: admin
      image: registry.example.com/admin:latest  # placeholder image
      securityContext:
        privileged: true  # privileged is a container-level field, not a Pod-level one
      volumeMounts:
        - name: shared-data
          mountPath: /data/admin
          subPath: privileged  # Different subPath, different context
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-pvc
```
This pattern lets you keep sharing infrastructure while respecting the new mount-time labeling constraints.
The Real Question
What bothers me is how many teams won't catch this until it breaks. The change makes logical sense from a performance perspective, but the migration path is genuinely difficult if you've built systems around the old behavior. I'm curious: how many of you are actually running SELinux in enforcing mode on production Kubernetes? And how many know it?
If you're managing infrastructure for any Red Hat-based platform, start the audit now. Don't let v1.37 surprise you.
Source: This post was inspired by "SELinux Volume Label Changes goes GA (and likely implications in v1.37)" on the Kubernetes Blog.