Episode 20

The Tuesday

June 15, 2026 · 16:15

Linux kernel maintainers are floating a proposal that would let admins disable vulnerable kernel functions at runtime. The feature is called Killswitch, and the patch was submitted in early May by Sasha Levin, a distinguished engineer at Nvidia and co-maintainer of the long-term support and stable Linux kernel trees. The Register covered it. The pitch is straightforward — when a serious vulnerability drops and patches aren’t ready, instead of waiting for the build-distribute-reboot cycle, you flip a switch and the buggy function refuses to run.

The proposal arrived after a rough stretch for Linux. CopyFail (CVE-2026-31431) dropped and went from disclosure to active exploitation in days. Dirty Frag then landed with public exploit code targeting the IPsec ESP and RxRPC subsystems and no official fix at the time of disclosure. The kernel community is now openly discussing whether broken functionality might be preferable to weaponized functionality. Red Hat is on record supporting the idea; the security forums are calling it “terrifying” and “absolutely ridiculous”; both reactions are defensible from where the people saying them are sitting.

The panel’s argument lands on four positions held simultaneously. First, the mechanism isn’t new: Solaris had psradm, IBM had dynamic LPAR reconfiguration, AIX had rmdev, and every generation of enterprise Unix shipped a version of “turn off the broken thing at runtime.” Second, the threat model is real, but the larger threat is operational, not adversarial: the Tuesday-afternoon mis-toggle that breaks production and takes six hours to diagnose is more likely than the APT using Killswitch as a defense-evasion primitive. Third, the proposal is the right answer to the problem the kernel community is actually facing: patch pipelines cannot keep up with disclosure pipelines, and that’s a structural admission worth sitting with. Fourth, the feature will be implemented badly in its first version, get logging in its second and TTLs in its third, and become a NIST control by its fourth, some eight years on, by which point nobody will remember it was supposed to be an emergency mechanism. That arc is the show.

Topics

The Killswitch proposal: Sasha Levin’s patch to let admins disable vulnerable kernel functions at runtime
CopyFail (CVE-2026-31431) and Dirty Frag (CVE-2026-43284, CVE-2026-43500) as the disclosure-pipeline pressure that triggered the proposal
The historical precedent: psradm on Solaris, dynamic LPAR reconfiguration on IBM, rmdev on AIX — runtime-disable mechanisms older than Linux itself
Why the kernel community is formalizing now what admins have done informally for thirty years — the operational scale shift from forty hosts to forty thousand
The CISO threat model — Killswitch as a defense-evasion primitive in the MITRE ATT&CK T1562 sense, where the cleanup-after-compromise step looks like normal admin activity
The actual failure mode — not the APT, the Tuesday — the mis-toggle that breaks production with a syscall returning EINVAL instead of segfaulting
Long-lived configuration drift as the third abuse vector — the function disabled in an incident in May that’s still off eighteen months later, with the team rotated out
The fleet-manager-versus-human-admin transition — and why the human who read the man page in 2002 is not the operator running the runbook in 2026
The five-stage arc of every emergency mechanism in computing history: ships, adopted in the first crisis, abused in the second, normal by the third, forgotten by the fifth
What Killswitch signals about the patch pipeline versus the disclosure pipeline — the structural admission that the defenders are losing the foot race

Goat List Reasons referenced

#50 — Goat security is applied completely, thoroughly, and with all the features you’ll ever need, using a stake and a rope.
#32 — A goat has all the patches it will ever need.
#5 — When a goat goes down, you just bury it and buy a new goat.

Source Article

Linux kernel maintainers pitch emergency killswitch after CopyFail and Dirty Frag chaos — The Register, May 11, 2026. Reporting on Sasha Levin’s proposal to introduce a runtime kill switch for vulnerable kernel functions, the operational context provided by the CopyFail and Dirty Frag vulnerabilities, the divided industry response (Red Hat supportive, security-forum users skeptical), and the broader question of whether the patch pipeline can keep up with the disclosure pipeline in an era of accelerated exploit weaponization.

Additional context from secondary coverage at CSOonline (Linux kernel maintainers suggest a ‘kill switch’ to protect systems until a zero-day vulnerability is patched), including the Red Hat statement from VP of core platforms Mike McGrath and skeptical analysis from DeepCove CyberSecurity’s Kellman Meghu and Enderle Group’s Robert Enderle. Primary kernel mailing list discussion is at lore.kernel.org.

Panel

The Legacy Sysadmin
The Paranoid CISO
The DBA
The Goat Farmer’s Counsel

Transcript

HOST: Welcome back to Stake and Rope, from Goat Security. Today: Linux kernel maintainers are floating a proposal that would let admins disable vulnerable kernel functions at runtime. The feature is called Killswitch, and the patch was submitted last month by Sasha Levin, a stable kernel co-maintainer at Nvidia. The pitch is straightforward: when a serious vulnerability drops and patches aren’t ready, instead of waiting for the build-distribute-reboot cycle, you flip a switch and the buggy function just refuses to run. This comes after a rough two weeks for Linux — CopyFail dropped, went from disclosure to active exploitation in days, and then Dirty Frag landed with public exploit code and no official fix. So the kernel community is now openly discussing whether broken functionality might be preferable to weaponized functionality.

HOST: On the panel: the Legacy Sysadmin, who has seen the emergency-switch pattern before and will tell us where. The Paranoid CISO, who I assume already has thoughts about a mitigation that is also an attack surface. The DBA, who will tell us what happens when this ships and meets production. Goat Farmer’s here too. Legacy, you first. Emergency killswitch for a running kernel. What does this remind you of?

LEGACY SYSADMIN: [sighs] It doesn’t remind me of anything. I lived it. Solaris had a feature called “psradm” that would let you take a CPU offline at runtime. IBM had dynamic LPAR reconfig that could rip a memory bank out of a running partition. AIX had something called “rmdev” that would let you yank a device driver out of the kernel while the system was up. The pattern is older than Linux. The pattern is older than most of the people writing about Linux.

LEGACY SYSADMIN: What’s new here is the framing. They’re calling it a security feature. The old ones were called “service features.” You used them so a field engineer could replace a board without scheduling downtime. Same mechanism. Different marketing.

THE DBA: Who owns the decision to flip it?

LEGACY SYSADMIN: That’s the question.

THE DBA: Because in 1998 it was the field engineer with the badge. In 2026 it’s whoever has root on a Tuesday afternoon.

HOST: CISO, what’s the threat model here? You’ve got a feature that, by design, disables kernel functionality at runtime. Walk me through it.

PARANOID CISO: The surface read is that this is a defensive mitigation. The deeper read is that it’s a new privileged interface with the ability to alter kernel behavior in ways that don’t generate the usual signals. Think about what the detection story looks like. A normal kernel exploit attempts a syscall, hits the vulnerable function, and somewhere in your stack — eBPF, syscall auditing, EDR — there’s a chance you catch the attempt. With Killswitch active, the call returns failure cleanly. Same return path as a thousand other benign errors.

PARANOID CISO: So now you have a mechanism that can be used to silently disable detection capabilities. An adversary with root — which is admittedly already a bad day — can disable the function that would have caught their next move. MITRE ATT&CK T1562, impair defenses. Killswitch becomes a tool for the same.

HOST: Hold on. The adversary already has root in that scenario. What does Killswitch actually give them they didn’t have?

PARANOID CISO: Plausibility. An EDR alert showing a kernel module being unloaded is loud. An admin-flipped killswitch toggle is a sysadmin doing their job. The signal-to-noise changes.

THE DBA: [scoffs] That’s a real concern. But it’s not the concern.

PARANOID CISO: Go ahead.

THE DBA: The concern is that you ship this, and six months later there’s a Tuesday afternoon outage, and the post-mortem says someone toggled the killswitch on the wrong subsystem during the CopyFail-3 disclosure, and nobody noticed until the storage layer started returning EINVAL on every other write. That’s the failure mode. Not the APT. The Tuesday.

LEGACY SYSADMIN: He’s right. Dynamic LPAR took down more production systems in the early 2000s than any vulnerability did. Not because the feature was broken. Because the feature worked, and people used it, and people are not careful.

HOST: Legacy, I want to push on that. You said this isn’t new. Are you saying the proposal is bad, or are you saying it’s just not novel?

LEGACY SYSADMIN: I’m saying it’s not novel. The proposal is fine. The proposal is, in fact, the obvious answer to the problem they’re describing. CopyFail went weaponized in three days. Dirty Frag dropped with public exploit code. You can’t reboot a fleet of forty thousand servers in three days. You can’t even survey a fleet of forty thousand servers in three days. So you need a thing that breaks the vulnerable path without requiring a full kernel rebuild. That’s what they’re shipping. It’s the right answer. It’s just an old answer.

GOAT FARMER: Had that one in ’04.

HOST: What were you running?

GOAT FARMER: Solaris 8 on E10Ks. Patch landed Friday. Couldn’t take the cluster down till the next maintenance window. Two weeks. We turned off the daemon. Same idea.

GOAT FARMER: Reason number 50. Goat security is applied completely, thoroughly, and with all the features you’ll ever need, using a stake and a rope.

HOST: Standard issue. CISO, let me come back to you. The article mentions developer debate around abuse potential. Where does the actual abuse vector sit, in your read?

PARANOID CISO: Three places. First, the one I already raised — defense evasion post-compromise. Second, supply chain. The killswitch interface is going to be exposed somewhere. A config file, a sysfs entry, a kernel command-line parameter. Whatever it is becomes a target for any privilege-escalation chain that wants to also disable the mitigations applied after the previous privilege-escalation chain. You’re stacking primitives.

PARANOID CISO: Third, and this is the one I’m watching most carefully — the feature creates a new class of long-lived configuration drift. A function gets disabled during an incident in May. The patch ships in June. Nobody re-enables it because nobody remembers. Eighteen months later, that subsystem is permanently off in production, and the team that owns it has rotated out. [pause] You don’t have a vulnerable system. You have a system with a structural hole that the security team isn’t tracking because it doesn’t look like a vulnerability anymore. It looks like a configuration choice.

THE DBA: That’s the one. That’s the actual problem.

LEGACY SYSADMIN: That’s how you end up with NIS still running in 2019.

THE DBA: That’s how you end up with twenty things still running in 2026 that nobody can explain.
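The drift scenario the CISO describes (a function disabled in May, the patch shipped in June, the toggle still on eighteen months later) is the kind of thing a periodic fleet check could surface. The following is a minimal sketch under stated assumptions: the `ActiveToggle` record, the `find_stale` helper, the thirty-day age limit, and the example host and function names are all hypothetical, and nothing here reflects an interface the Killswitch patch actually exposes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ActiveToggle:
    """One disabled function on one host (hypothetical inventory record)."""
    host: str
    function: str
    cve: str
    disabled_on: date


def find_stale(toggles, patched_cves, today, max_age_days=30):
    """Return (toggle, reason) pairs for toggles that look like drift.

    A toggle is flagged when the fix for its CVE is already deployed,
    or when it has simply been in place longer than max_age_days.
    """
    stale = []
    for toggle in toggles:
        if toggle.cve in patched_cves:
            stale.append((toggle, "fix shipped"))
        elif (today - toggle.disabled_on).days > max_age_days:
            stale.append((toggle, "older than max age"))
    return stale


# Example inventory: host and function names are made up for illustration.
inventory = [
    ActiveToggle("db-01", "example_vulnerable_fn", "CVE-2026-31431", date(2026, 5, 12)),
    ActiveToggle("db-02", "another_example_fn", "CVE-2026-43284", date(2027, 11, 1)),
]
report = find_stale(inventory, patched_cves={"CVE-2026-31431"}, today=date(2027, 11, 15))
for toggle, reason in report:
    print(f"{toggle.host}: {toggle.function} still disabled ({reason})")
```

The point of the sketch is that the check needs two inputs a fleet rarely keeps together: the list of what is disabled and the list of what has since been fixed.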

HOST: Let me pull back for a second, because I think this connects to something larger. We’ve been talking about this as a kernel feature. But it’s really a workflow change. Legacy, the kernel community is proposing to formalize what admins have done informally for thirty years — turn off the broken thing until you can fix it. Why is that worth formalizing now?

LEGACY SYSADMIN: Because the informal version doesn’t scale. In 1998, “turn it off” meant SSH into the box, edit a config file, restart a daemon. Three minutes per host. You had forty hosts. Two hours of work, done before lunch. In 2026, you have forty thousand hosts, you don’t have shell access to most of them, the config is managed by something that may or may not be running, and the “turn it off” step is a different procedure on every fleet you own. The kernel community is acknowledging that the operational reality changed and the mechanism needs to change with it.

LEGACY SYSADMIN: The old way assumed an admin who knew the system. The new way assumes a fleet manager pushing a config change. Killswitch is for the fleet manager. The original turn-it-off pattern was for the human. That’s the actual transition that’s happening here, and it’s been happening for fifteen years. This is just one more place it’s showing up.

HOST: That’s a useful frame. DBA, does that match what you see?

THE DBA: Mostly. The piece he’s leaving out is that the fleet manager doesn’t know what the function does. The human who ran rmdev in 2002 had read the man page. The fleet manager has a list of functions to disable that came from a vendor security advisory, and the operator running the fleet manager is following a runbook. Nobody in that chain has read the kernel source. When something downstream breaks, nobody in that chain knows why.

THE DBA: That’s fine when it works. It’s catastrophic when it doesn’t, because the diagnostic path runs through three layers of automation that weren’t designed to be diagnostic.

HOST: Right. So now I want to ask the obvious question. Is this a good idea?

LEGACY SYSADMIN: Yes.

PARANOID CISO: Conditionally.

THE DBA: It’s the right idea. It will be implemented badly.

HOST: Three answers. CISO, give me the condition.

PARANOID CISO: The condition is auditability. Every toggle generates a signed, immutable log entry that goes to a SIEM the team that owns the host doesn’t control. Every toggle has a TTL — it auto-expires in seventy-two hours unless explicitly extended. Every toggle requires a documented CVE reference. If you build it that way, the abuse surface I described shrinks substantially. If you build it as a sysfs knob with no audit trail, you have shipped a footgun.

THE DBA: They’re not going to build it that way.

PARANOID CISO: I know.

THE DBA: The first version will be a sysfs knob.

LEGACY SYSADMIN: The second version will have logging. The third version will have TTLs. By the fourth version it’ll be a NIST control. We’ll get there. It’ll take eight years.
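The three conditions the CISO lists (a log entry per toggle, a seventy-two-hour TTL unless extended, a mandatory CVE reference) can be sketched in userspace terms. This is an illustration only: `ToggleLedger` and its methods are hypothetical names, not part of the proposed patch, and a SHA-256 hash chain stands in here for the signed, externally held log the CISO asks for. It makes tampering detectable but is not a signature and does not ship anything to a SIEM.

```python
import hashlib
import json
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

CVE_PATTERN = re.compile(r"^CVE-\d{4}-\d{4,}$")
DEFAULT_TTL = timedelta(hours=72)  # the TTL suggested in the discussion


@dataclass(frozen=True)
class Toggle:
    function: str
    cve: str
    operator: str
    expires_at: datetime


class ToggleLedger:
    """Hypothetical ledger: every change is appended to a hash-chained log."""

    def __init__(self):
        self.active = {}
        self.log = []
        self._prev = "0" * 64

    @staticmethod
    def _digest(body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def _append(self, event, toggle, now):
        body = {"event": event, "function": toggle.function, "cve": toggle.cve,
                "operator": toggle.operator, "at": now.isoformat(),
                "expires_at": toggle.expires_at.isoformat(), "prev": self._prev}
        self._prev = self._digest(body)
        self.log.append({**body, "hash": self._prev})

    def disable(self, function, cve, operator, now, ttl=DEFAULT_TTL):
        if not CVE_PATTERN.match(cve):
            raise ValueError(f"toggle requires a CVE reference, got {cve!r}")
        toggle = Toggle(function, cve, operator, now + ttl)
        self.active[function] = toggle
        self._append("disable", toggle, now)
        return toggle

    def extend(self, function, operator, now, ttl=DEFAULT_TTL):
        old = self.active[function]  # KeyError if the toggle is not active
        toggle = Toggle(function, old.cve, operator, now + ttl)
        self.active[function] = toggle
        self._append("extend", toggle, now)
        return toggle

    def expire(self, now):
        """Drop toggles whose TTL has passed; return their function names."""
        expired = [t for t in self.active.values() if t.expires_at <= now]
        for toggle in expired:
            del self.active[toggle.function]
            self._append("expire", toggle, now)
        return [t.function for t in expired]

    def verify(self):
        """Recompute the chain; False if any entry was altered or reordered."""
        prev = "0" * 64
        for entry in self.log:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != prev or self._digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


# Usage: the function name is made up for illustration.
start = datetime(2026, 5, 11, 16, 0, tzinfo=timezone.utc)
ledger = ToggleLedger()
ledger.disable("example_vulnerable_fn", "CVE-2026-31431", "alice", start)
still_active = ledger.expire(start + timedelta(hours=71))
lapsed = ledger.expire(start + timedelta(hours=72))
```

Whether expiry should silently re-enable a vulnerable function or only raise an alert is a design choice the discussion leaves open; the sketch just records the event.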

HOST: Goat Farmer, you’ve been listening. Anything?

GOAT FARMER: Reason number 32. A goat has all the patches it will ever need.

HOST: Okay. Let’s land the plane. Closing thoughts. Goat Farmer first.

GOAT FARMER: Reason number 5. When a goat goes down, you just bury it and buy a new goat.

GOAT FARMER: [pause] I don’t miss any of it.

HOST: Legacy.

LEGACY SYSADMIN: [sighs] I want to be careful here, because the thing I’m about to say sounds dismissive and it isn’t. Every emergency mechanism in computing history has the same arc. It ships because the existing process is too slow. It gets adopted enthusiastically during the first crisis. It gets abused during the second crisis. It becomes operationally normal by the third crisis. And by the fifth crisis, nobody remembers it was supposed to be an emergency mechanism — it’s just part of how the system runs. Killswitch will follow that arc. The patch is the right patch. The future is the future. I’m not against it. I’ve just seen this movie six times, and I know where the third act goes.

HOST: CISO.

PARANOID CISO: The thing that concerns me most about Killswitch isn’t the feature itself. It’s what it signals about where we’ve arrived. The kernel community is acknowledging that the patch pipeline cannot keep up with the disclosure pipeline. That’s a structural admission. It means the model we’ve been running on for twenty years — coordinated disclosure, downstream packaging, customer reboot — is no longer reliably faster than weaponization. We’re now building features that assume the defenders are losing the foot race. [pause] Killswitch is a reasonable response to that reality. But the reality is the story. Not the feature.

HOST: DBA, close it out.

THE DBA: [exhales] Here’s what’s going to happen. The patch lands in a stable kernel by the end of the year. Enterprise distributions backport it sometime next year. A major CVE drops in 2028 — call it CopyFail-3 — and a handful of security teams flip the killswitch on a function nobody fully understands. Most of them are fine. Two of them break production. One of those two takes six hours to diagnose because the failure mode is a syscall returning EINVAL instead of segfaulting, and nothing in their observability stack flags EINVAL as anomalous. The post-mortem says they need better runbooks. They write better runbooks. The runbooks sit in a Confluence page that nobody reads. The next time it happens, it happens the same way.

THE DBA: The feature is correct. The feature will work. The operational layer around the feature is what fails, and the operational layer is what nobody’s funding. [pause] That’s not a kernel problem. That’s everything else.
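The six-hour diagnosis the DBA predicts turns on one detail: nothing in the observability stack treats EINVAL as anomalous. One rough way to close that gap is to compare the current EINVAL rate for a syscall against a baseline window. The sketch below assumes error codes have already been collected by some means (the discussion mentions eBPF and syscall auditing); the `einval_rate` and `is_anomalous` helpers and their thresholds are hypothetical.

```python
import errno
from collections import Counter


def einval_rate(results):
    """Fraction of calls that returned EINVAL.

    `results` is an iterable of errno values, with 0 meaning success.
    """
    counts = Counter(results)
    total = sum(counts.values())
    return counts[errno.EINVAL] / total if total else 0.0


def is_anomalous(baseline, current, min_ratio=5.0, floor=0.01):
    """Flag the current window when EINVAL is both common and unusual.

    `floor` ignores windows where EINVAL is rare in absolute terms;
    `min_ratio` requires the rate to be several times the baseline.
    Both thresholds are illustrative and would need tuning.
    """
    base = einval_rate(baseline)
    cur = einval_rate(current)
    return cur >= floor and cur >= min_ratio * max(base, 1e-9)


# Baseline: 1% of writes fail with EINVAL. Incident: every other write fails.
baseline_window = [0] * 990 + [errno.EINVAL] * 10
incident_window = [0, errno.EINVAL] * 500
flagged = is_anomalous(baseline_window, incident_window)
```

A rate comparison like this does not say why the calls fail, only that a normally quiet error code has become loud, which is the signal the DBA says is missing.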

HOST: Broken functionality preferable to weaponized functionality, until broken functionality becomes the functionality. We’ll see you next time.
