Hacker News
by rockballslab 28 days ago
SSH host key leak via ptrace, and why I almost nuked my prod patching it

---

Last week my automated security scanner flagged CVE-2026-46333, also known as ssh-keysign-pwn. The vulnerability is a ptrace race condition in the Linux kernel that lets any local unprivileged user read the SSH host private keys of the server, and as a bonus, /etc/shadow.

The mechanics are straightforward. When a suid or sgid process exits, there is a brief window during which ptrace() can attach and read its open file descriptors. ssh-keysign is suid. chage is suid or sgid, depending on the distribution. An attacker with a local shell account, a tight loop, and a bit of patience can win that race and walk away with your server's SSH identity and your password hashes.
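To see which binaries on a given host carry the setuid or setgid bit, something like the following sketch may help. It assumes GNU find (for the `-perm /6000` syntax), and `list_setid_files` is a hypothetical helper name, not an existing tool. It only lists candidates; it does not test for the vulnerability.

```shell
# list_setid_files: hypothetical helper, assumes GNU find.
# Prints regular files under the given directories that have the
# setuid (4000) or setgid (2000) bit set. -xdev stays on one filesystem.
list_setid_files() {
    find "$@" -xdev -type f -perm /6000 2>/dev/null
}

# Example call (these paths are typical for Ubuntu, adjust for your distro):
#   list_setid_files /usr/bin /usr/lib/openssh
```

Restricting the search to a few directories keeps it fast. Pass `/` instead if you want the full picture.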

A public PoC dropped the same day as disclosure. The CVSS score is 5.5 (Medium), which badly undersells it. On a server where SSH is your only entry point, leaking the host private key means an attacker who later gets a network position can silently MITM every future connection. No warning. No changed fingerprint.

The kernel patch landed in Ubuntu's linux-generic 6.8.0-117 (USN-8278-1). But applying it requires a reboot. The interim mitigation is a single sysctl:

    echo "kernel.yama.ptrace_scope = 2" | sudo tee /etc/sysctl.d/99-ptrace.conf
    sudo sysctl -p /etc/sysctl.d/99-ptrace.conf

Value 2 means only processes with CAP_SYS_PTRACE can use ptrace. Debuggers stop working for regular users. On a production VPS that is an acceptable tradeoff.

Now for the embarrassing part.

My security report also recommended applying the kernel update and rebooting. I copy-pasted the commands into a terminal without thinking:

    sudo apt-get update && sudo apt-get dist-upgrade -y
    sudo reboot

I caught myself before hitting Enter. The server was running 45 Docker containers: three PostgreSQL databases, two Redis instances, a voice agent, n8n, Typebot, Traefik, Prometheus, Grafana, and several production SaaS apps with live users.

A blind dist-upgrade with -y on a server you have not reviewed in weeks is risky. Packages get added. Packages get removed. Things break in ways that are annoying at 2am. And a reboot without a maintenance window means downtime with zero preparation.
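One way to make the upgrade less blind is to simulate it first and look at what would change. The sketch below relies on `apt-get -s` (simulate), which prints one `Inst`, `Remv`, or `Conf` line per planned action and changes nothing. `summarize_apt_sim` is a hypothetical helper name, and the summary format is just an illustration.

```shell
# summarize_apt_sim: hypothetical helper. Reads `apt-get -s` output on stdin
# and counts "Inst" (install/upgrade) and "Remv" (remove) lines, listing the
# names of any packages that would be removed.
summarize_apt_sim() {
    awk '
        /^Inst / { inst++ }
        /^Remv / { remv++; removed = removed " " $2 }
        END {
            printf "install_or_upgrade=%d remove=%d\n", inst, remv
            if (remv > 0) printf "removed:%s\n", removed
        }
    '
}

# Usage on a Debian/Ubuntu host (nothing is installed or removed with -s):
#   apt-get -s dist-upgrade | summarize_apt_sim
```

Any non-zero `remove` count is worth reading line by line before running the real thing.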

The thing is, the mitigation does not require a reboot at all. sysctl -p applies the setting immediately. I was already protected the moment I ran those two commands. The reboot could wait for a proper maintenance window.
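To keep track of whether that deferred reboot is still outstanding, a small check along these lines can go into a cron job or login script. It assumes the Debian/Ubuntu convention of a `/var/run/reboot-required` flag file created when an installed update needs a reboot. `reboot_pending` is a hypothetical helper name.

```shell
# reboot_pending: hypothetical helper. Succeeds if the reboot flag file exists.
# The default path is the Debian/Ubuntu convention; pass another path to override.
reboot_pending() {
    flag=${1:-/var/run/reboot-required}
    [ -e "$flag" ]
}

echo "running kernel: $(uname -r)"
if reboot_pending; then
    echo "reboot pending: schedule a maintenance window"
else
    echo "no reboot flag set"
fi
```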

Two lessons I am taking away.

One: CVSS scores describe the vulnerability, not your exposure. A 5.5 on a server where SSH is the perimeter is not a 5.5 in practice.

Two: "apply patch and reboot" is not a procedure. On a production server with stateful services it is a plan waiting to fail. The question is always: what is the fastest mitigation that requires zero downtime, and when is the maintenance window for the rest?

Check your ptrace_scope:

    sysctl kernel.yama.ptrace_scope

If it returns 0 or 1, you are exposed on any unpatched kernel. Two commands fix it with no reboot required.
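For checking more than one machine, the same test can be scripted. This is a minimal sketch: `describe_ptrace_scope` is a hypothetical helper name, and treating value 3 (no attach at all) as mitigated is my assumption, since the post only discusses 0, 1, and 2.

```shell
# describe_ptrace_scope: hypothetical helper that maps a
# kernel.yama.ptrace_scope value to a rough status label.
describe_ptrace_scope() {
    case "$1" in
        0|1) echo "exposed" ;;
        2|3) echo "mitigated" ;;   # 3 counted as mitigated: an assumption
        *)   echo "unknown" ;;
    esac
}

# Yama exposes the current value under /proc. If the file is absent
# (for example, Yama is not enabled), report "missing".
scope_file=/proc/sys/kernel/yama/ptrace_scope
if [ -r "$scope_file" ]; then
    value=$(cat "$scope_file")
else
    value=missing
fi
echo "ptrace_scope=$value: $(describe_ptrace_scope "$value")"
```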