Miggo

This week at Miggo, I was tasked with detecting two recent Linux local privilege escalation vulnerabilities: CopyFail and DirtyFrag.

Our research team gave me a detailed analysis of how both bugs work, and the exploitation primitive immediately reminded me of Dirty Pipe: splice() combined with a controlled write to a page that is already resident in the page cache.

The detection requirement was specific: high confidence, minimal false positives, and no broad mitigations that would break legitimate workloads or require maintaining binary allowlists (a bad idea).

There is a lot of material to cover. We’ll go through:

Which socket subsystems were vulnerable
A quick look at eBPF LSM
What is CopyFail
What is DirtyFrag
Where most detections fail
How we approached detections outside the box
Conclusion

Also see the sister post on our main website.

The socket subsystems involved

Before getting to the exploits and detection, it helps to understand why these socket subsystems exist and what normal use looks like. Both bugs live in parts of the kernel that are genuinely useful and actively used.

AF_ALG — the kernel crypto API socket interface

AF_ALG sockets were introduced to give userspace a way to reach the kernel’s crypto subsystem without routing data through a separate daemon or copying it into a library.

An application creates a socket in the AF_ALG family, binds it to a named algorithm (e.g., "gcm(aes)"), and then uses standard socket operations - setsockopt, sendmsg, recvmsg - to drive encryption, decryption, and authentication. The kernel does the work; the data never has to leave kernel space for the crypto engine.

Tools that deal with disk encryption, VPNs, or kernel-assisted TLS offload use this path. It keeps keys manageable from userspace while offloading computation. In normal usage, the sequence is straightforward: create, bind, set a valid key once, then operate on data.

The associated data length (ALG_SET_AEAD_ASSOCLEN) is also configured once, at session initialization, according to what the protocol requires.
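To make the "create, bind, key once, operate" sequence concrete, here is a minimal userspace sketch. The helper names fill_aead_addr and open_aead_session are hypothetical, and the accept() step (which yields the per-session operation socket) is how I understand the AF_ALG interface to work rather than something described above.

```c
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SOL_ALG
#define SOL_ALG 279 /* fallback for older libc headers */
#endif

/* Fill the bind address for an AEAD transform such as "gcm(aes)". */
static void fill_aead_addr(struct sockaddr_alg *sa, const char *alg_name)
{
    memset(sa, 0, sizeof(*sa));
    sa->salg_family = AF_ALG;
    strncpy((char *)sa->salg_type, "aead", sizeof(sa->salg_type) - 1);
    strncpy((char *)sa->salg_name, alg_name, sizeof(sa->salg_name) - 1);
}

/*
 * Create, bind, set the key once, then accept an operation socket.
 * Returns the operation fd, or -1 if any step fails (for example when
 * AF_ALG or the algorithm is not available on the running kernel).
 */
static int open_aead_session(const char *alg_name,
                             const unsigned char *key, size_t key_len)
{
    struct sockaddr_alg sa;
    int tfm_fd, op_fd;

    fill_aead_addr(&sa, alg_name);

    tfm_fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfm_fd < 0)
        return -1;
    if (bind(tfm_fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        setsockopt(tfm_fd, SOL_ALG, ALG_SET_KEY, key, key_len) < 0) {
        close(tfm_fd);
        return -1;
    }
    op_fd = accept(tfm_fd, NULL, NULL); /* per-session operation socket */
    close(tfm_fd);
    return op_fd;
}
```

A legitimate caller runs this once per session with a real key. The exploit pattern described later differs in exactly the places this sketch touches: how often the key is set and what it contains.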

UDP_ENCAP_ESPINUDP — IPsec NAT traversal

IPsec operates at the network layer and, in its default transport mode, is invisible to the applications sending traffic over it. The problem is NAT: NAT devices generally cannot handle raw ESP packets (the protocol that carries IPsec-protected payloads) because ESP has no port numbers to track state against.

UDP_ENCAP_ESPINUDP wraps ESP packets inside UDP, which allows NAT traversal to work. You configure this once on a socket that will carry IPsec traffic, set it, and leave it.

That means a legitimate VPN daemon or an IPsec stack will create a small number of long-lived sockets with this option set. It will not spray dozens of short-lived sockets and set the same option in rapid succession.

RxRPC — the AFS/Kerberos transport

RxRPC is a reliable UDP-based transport protocol used primarily by AFS (Andrew File System) and, in the Linux kernel, by the kernel’s Kerberos (rxkad) authentication path.

It has its own socket family (AF_RXRPC) and its own security options:

RXRPC_SECURITY_KEY installs a Kerberos ticket for authenticated sessions
RXRPC_MIN_SECURITY_LEVEL enforces the minimum level of protection required on the connection.

Real RxRPC clients — AFS clients, for instance — set these options at a measured pace, once per connection, before communicating with a fileserver. They do not create and configure dozens of authenticated sockets in rapid succession before dropping them.

eBPF LSM: the right attachment point

Why this hook exists

The Linux Security Module framework provides a set of hooks throughout the kernel at security-relevant decision points. These hooks predate eBPF by a significant margin; they were designed to give implementations like SELinux and AppArmor a stable, auditable place to enforce policy.

When the kernel is about to execute a sensitive operation (a setsockopt call, an mmap, a file open), it calls through the LSM hook. SELinux and AppArmor register their enforcement callbacks there at boot time.

security_socket_setsockopt is one such hook. It fires on every setsockopt call that passes initial validation, receiving the socket, the level, the option name, and the caller’s credentials. SELinux uses it to enforce network socket policy; AppArmor uses it to check socket rules in profiles.

eBPF programs at LSM hooks

Since Linux 5.7, eBPF programs can attach to LSM hooks directly, through the BPF_PROG_TYPE_LSM program type and the BPF_LSM_MAC attach type.

This does not require that an LSM backend (SELinux, AppArmor) be active. The eBPF program is registered separately and runs regardless of whether a traditional MAC system is enabled.

This matters in practice because the majority of general-purpose Linux distributions do not ship with a strict MAC policy enforced by default. Ubuntu, for example, ships AppArmor with a limited set of active profiles.

Fedora ships SELinux in enforcing mode but with a policy that may or may not cover the operations relevant here. In many container and cloud environments, MAC is deliberately disabled for operational simplicity.

An eBPF LSM program runs at the same hook regardless. It can observe the call, record state in a BPF map, and emit a signal — without any dependency on a userspace policy daemon or a kernel compiled with a particular CONFIG_DEFAULT_SECURITY.

BPF LSM programs can attach to these hooks on any kernel with CONFIG_BPF_LSM=y and bpf present in the active LSM list, which is a separate condition from BTF availability.
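A loader can check the second condition at startup before trying to attach. The sketch below reads the active LSM list from /sys/kernel/security/lsm and looks for a whole "bpf" entry. The function names are hypothetical, and it assumes securityfs is mounted at its usual location.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `name` appears as a whole entry in a comma-separated list. */
static bool lsm_list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    while (*p) {
        size_t entry_len = strcspn(p, ",\n");
        if (entry_len == name_len && strncmp(p, name, name_len) == 0)
            return true;
        p += entry_len;
        if (*p)
            p++; /* skip the separator */
    }
    return false;
}

/* Read the active LSM list from securityfs and look for "bpf". */
static bool bpf_lsm_active(void)
{
    char buf[256] = {0};
    FILE *f = fopen("/sys/kernel/security/lsm", "r");

    if (!f)
        return false; /* securityfs not mounted or not readable */
    if (!fgets(buf, sizeof(buf), f))
        buf[0] = '\0';
    fclose(f);
    return lsm_list_contains(buf, "bpf");
}
```

If bpf_lsm_active() returns false, an lsm/ program will not run even when CONFIG_BPF_LSM=y. In that case fexit-based observation is the fallback.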

The distinction between detection and enforcement is a matter of program return value.

An eBPF program attached at fexit (the function exit path) or fentry (function entry) observes but cannot alter the return value. A program attached at an LSM hook and returning a negative error code can block the syscall entirely - the same mechanism AppArmor and SELinux use.

Whether to block or only alert is an operational decision.
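The difference can be modeled in a few lines of plain C. This is not BPF code, only a sketch of the return-value semantics described above. The enum and function names are hypothetical, and -EPERM is just one reasonable error code to return.

```c
#include <errno.h>
#include <stdbool.h>

enum attach_kind { ATTACH_FEXIT, ATTACH_LSM };
enum mode        { MODE_ALERT, MODE_BLOCK };

/* What the detector's program body would return for a given task. */
static int detector_return(enum mode mode, bool flagged)
{
    return (flagged && mode == MODE_BLOCK) ? -EPERM : 0;
}

/*
 * What the hook reports back to the setsockopt path. An fexit program's
 * return value is not fed back into the traced function, so it always
 * behaves like "allow"; an LSM program's return value is the verdict.
 */
static int hook_verdict(enum attach_kind kind, int prog_ret)
{
    return kind == ATTACH_LSM ? prog_ret : 0;
}
```

The same detection logic can therefore ship in alert-only form at fexit and be promoted to blocking by moving it to the LSM attach point.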

CopyFail (CVE-2026-31431)

What the exploit does

CopyFail abuses AF_ALG AEAD sockets to corrupt the kernel page cache. The relevant deviation from normal AF_ALG usage comes down to two setsockopt options.

ALG_SET_KEY is normally called once per socket to install a valid cryptographic key. If a real key is set, the kernel enforces authentication over all subsequent operations.

The exploit deliberately avoids this: it either omits the key entirely or sets a null one, which disables authentication and allows the page-cache write to go through without proper validation. Repeated calls to this option across multiple sockets - something no legitimate crypto application would do - are therefore a strong signal.

ALG_SET_AEAD_ASSOCLEN sets the length of the associated authentication data for AEAD operations. Normal protocols require this to match their AAD field, which is typically larger than four bytes.

The exploit sets it to exactly 4. This is sufficient to target the first four bytes of a page-cache page, and that is all the exploit needs for its write primitive.

The other AF_ALG options - ALG_SET_IVLEN, ALG_SET_IV, ALG_SET_OP - are not useful signals here. They are commonly passed as per-operation control messages through sendmsg, and their presence is normal in legitimate workloads.

OPTION	VALUE	Used in exploit	Normal usage	Detection signal
ALG_SET_KEY	1	Yes	Once per socket	null/weak/repeated
ALG_SET_AEAD_ASSOCLEN	5	Yes	Once, larger value	value == 4, repeated
ALG_SET_IVLEN	3	No (via sendmsg)	Control message	ignore
ALG_SET_IV	2	No (via sendmsg)	Control message	ignore
ALG_SET_OP	4	No (via sendmsg)	Control message	ignore

DirtyFrag (CVE-2026-43284, CVE-2026-43500)

What the exploit does

DirtyFrag reaches the same page-cache corruption primitive as CopyFail, but through two different socket subsystems. Each has its own variant.

XFRM/ESP variant. The exploit creates a large batch of XFRM Security Associations in ESN transport mode, then repeatedly creates UDP sockets and sets UDP_ENCAP_ESPINUDP before splicing file-backed pages into ESP packets.

The key observation: real IPsec/NAT-T code configures a small, stable set of long-lived sockets. The exploit sprays new sockets at high frequency, resetting the same option on each one. That burst cadence per task is the signal.

RxRPC/rxkad variant. The exploit creates many authenticated rxrpc client sockets in rapid succession, setting both RXRPC_SECURITY_KEY and RXRPC_MIN_SECURITY_LEVEL before attempting to splice a mapped file page as a forged DATA packet.

A real AFS client or kernel Kerberos session establishes a small number of connections at a measured pace. Multiple rapid-fire repetitions of both options within the same task are not a legitimate pattern.

Why most existing detections fall short

With both vulnerabilities covered, consider how existing tools handle them. Most runtime detection tools (eBPF-based or not) that have addressed these vulnerabilities take one of two approaches.

The first is to flag or block every use of AF_ALG sockets (or UDP_ENCAP, or RXRPC), sometimes combined with an allowlist of known-good binaries.
The second is to detect specific files being mmap'ed or spliced. Most exploits target /etc/passwd in some variants and setuid binaries in others.

Both approaches have the same problem:

They are not specific to the vulnerability. An allowlist for AF_ALG immediately becomes a maintenance liability and a security gap for any new binary that legitimately uses crypto sockets. Blocking the socket family entirely disables real functionality.
Detecting specific file paths creates a detection that is trivially bypassed by pointing the exploit at a different file - because the vulnerable page-cache write path does not care which file backs the target page.

What the CopyFail and DirtyFrag exploits share is an abnormal pattern of socket option configuration: specific options, set with atypical values, at atypical frequency, within the same task. That pattern is what the detections should track.

Detecting CopyFail

The eBPF program attaches with fexit to security_socket_setsockopt (the LSM hook function called on the setsockopt path) and tracks only the two options above. For each task it maintains a small amount of state: a timestamp and a counter for each option. When either counter exceeds a threshold within a short time window, the task is flagged.

This detects the abnormal use of setsockopt without blocking the entire socket family or depending on which target file the exploit overwrites.
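The per-option counter is easiest to reason about outside the kernel first. Below is a plain userspace C model of one "timestamp plus counter" pair. The struct and function names are hypothetical, and the 2-second window and threshold of 4 are assumed values chosen to match the constants visible in the excerpt that follows.

```c
#include <stdbool.h>
#include <stdint.h>

#define BURST_WINDOW_NS 2000000000ULL /* assumed: max gap between calls, 2 s */
#define BURST_THRESHOLD 4             /* assumed: calls needed to flag */

struct burst_state {
    uint64_t last_ns; /* timestamp of the previous call, 0 = never seen */
    uint32_t count;   /* calls observed in the current burst */
};

/* Record one call at `now_ns`; return true once the burst hits the threshold. */
static bool burst_observe(struct burst_state *st, uint64_t now_ns)
{
    if (st->last_ns == 0 || now_ns - st->last_ns > BURST_WINDOW_NS)
        st->count = 1; /* gap too long: start a new burst */
    else
        st->count++;
    st->last_ns = now_ns;
    return st->count >= BURST_THRESHOLD;
}
```

Four calls spaced 100 ms apart trip the counter on the fourth call, while a caller that spaces calls 3 seconds apart never gets past a count of 1.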

SEC("fexit/security_socket_setsockopt")
int
BPF_PROG(
    _security_socket_setsockopt, // Executed at "sys_setsockopt".
    struct socket *sock,         // Socket to be setsockopt().
    int            level,        // Level: SOL_ALG.
    int            optname,      // Option: ALG_SET_KEY, ALG_SET_AEAD_ASSOCLEN.
    int            ret           // Return value for the security call.
)
{
    if (level != SOL_ALG)
        return 0;

    switch (optname) {
        case ALG_SET_KEY:
        case ALG_SET_AEAD_ASSOCLEN:
            break;
        default:
            return 0;
    }

    // ...

    copyfail_key_t key = {
        .task_hash = hash,
    };
    copyfail_val_t initval = {
        .last_time      = 0,
        .count_key      = 0,
        .count_assoclen = 0,
        .last_key       = 0,
        .last_assoclen  = 0,
        .exploited      = 0,
    };

    // Load (or initialize) the detection state for this task.
    copyfail_val_t *value = bpf_map_lookup_elem(&det_copy_fail, &key);
    if (!value) {
        bpf_map_update_elem(&det_copy_fail, &key, &initval, BPF_NOEXIST);
        value = bpf_map_lookup_elem(&det_copy_fail, &key);
        logger_goto_if_null(value, aleader, "null copyfail val");
    }

    if (value->exploited) {
        goto end; // Already detected.
    }

    u64 now = bpf_ktime_get_ns();

    switch (optname) {
        case ALG_SET_KEY:
        {
            u64 last = value->last_key;
            value->last_key = now;
            // A first call, or a gap longer than the window, starts a new burst.
            if (last == 0 || now - last > SEC2)
                value->count_key = 1;
            else
                value->count_key++;
            if (value->count_key < 4)
                goto end; // Burst still below the threshold.
            break;
        }
        case ALG_SET_AEAD_ASSOCLEN:
        {
            u64 last = value->last_assoclen;
            value->last_assoclen = now;
            // Same windowing as above, tracked separately for this option.
            if (last == 0 || now - last > SEC2)
                value->count_assoclen = 1;
            else
                value->count_assoclen++;
            if (value->count_assoclen < 4)
                goto end;
            break;
        }
        default:
            goto end;
    }

    // Exploit detected.
    logger_printk("AF_ALG exploit (pid=%d, count=%llu)", pid, value->count_key);
    value->exploited = 1;

    // ...
}

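The program does not filter ALG_SET_AEAD_ASSOCLEN on the specific value of 4, because security_socket_setsockopt receives only the socket, the level, and the option name, not the option value. I'll leave it to the reader to pick a nearby hook that exposes that value (or not). Filtering on it would reduce the chance of false positives even further.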

Detecting DirtyFrag

The same fexit/security_socket_setsockopt hook covers both variants. Per-task state tracks call frequency and timing separately for each vector, and an alert fires when counts exceed the expected threshold within the observation window.

Once again, we don’t need to know which binary or file the attacker is interested in. The detection is in the pattern.

SEC("fexit/security_socket_setsockopt")
int
BPF_PROG(
    _security_socket_setsockopt, // Executed at "sys_setsockopt".
    struct socket *sock,         // Socket to be setsockopt().
    int            level,        // Level: SOL_UDP or SOL_RXRPC.
    int            optname,      // Option: UDP_ENCAP, RXRPC_SECURITY_KEY, ...
    int            ret           // Return value for the security call.
)
{
    u64 vector = 0;

    if (level == SOL_UDP && optname == UDP_ENCAP) {
        vector = DIRTYFRAG_VECTOR_ESP;
    }
    else if (
        level == SOL_RXRPC &&
        (optname == RXRPC_SECURITY_KEY || optname == RXRPC_MIN_SECURITY_LEVEL)) {
        vector = DIRTYFRAG_VECTOR_RXRPC;
    }
    else {
        return 0;
    }

    // ...

    dirtyfrag_key_t key     = {.t_hash = aleader->hash};
    dirtyfrag_val_t initval = {
        .last_udp_encap     = 0,
        .last_rxrpc_key     = 0,
        .last_rxrpc_minsec  = 0,
        .count_udp_encap    = 0,
        .count_rxrpc_key    = 0,
        .count_rxrpc_minsec = 0,
        .exploited          = 0,
        .vector             = 0,
    };

    dirtyfrag_val_t *value = bpf_map_lookup_elem(&det_dirtyfrag, &key);
    if (!value) {
        bpf_map_update_elem(&det_dirtyfrag, &key, &initval, BPF_NOEXIST);
        value = bpf_map_lookup_elem(&det_dirtyfrag, &key);
        logger_goto_if_null(value, aleader, "null dirtyfrag val");
    }

    if (value->exploited) {
        goto aleader; // Already detected.
    }

    u64 now = bpf_ktime_get_ns();

    if (vector == DIRTYFRAG_VECTOR_ESP) {
        u64 last = value->last_udp_encap;
        if (last == 0 || now - last > DIRTYFRAG_ESP_WINDOW_NS) {
            value->count_udp_encap = 1;
        }
        else {
            value->count_udp_encap++;
        }
        value->last_udp_encap = now;

        if (value->count_udp_encap < DIRTYFRAG_ESP_THRESHOLD) {
            goto aleader;
        }
    }
    else if (optname == RXRPC_SECURITY_KEY) {
        u64 last = value->last_rxrpc_key;
        if (last == 0 || now - last > DIRTYFRAG_RXRPC_WINDOW_NS) {
            value->count_rxrpc_key = 1;
        }
        else {
            value->count_rxrpc_key++;
        }
        value->last_rxrpc_key = now;

        // Must have observed both RXRPC_SECURITY_KEY and 
        // RXRPC_MIN_SECURITY_LEVEL at sufficient frequency.
        if (value->count_rxrpc_key < DIRTYFRAG_RXRPC_THRESHOLD ||
            value->count_rxrpc_minsec < DIRTYFRAG_RXRPC_THRESHOLD) {
            goto aleader;
        }
    }
    else {
        // Only remaining case: RXRPC_MIN_SECURITY_LEVEL.
        u64 last = value->last_rxrpc_minsec;
        if (last == 0 || now - last > DIRTYFRAG_RXRPC_WINDOW_NS) {
            value->count_rxrpc_minsec = 1;
        }
        else {
            value->count_rxrpc_minsec++;
        }
        value->last_rxrpc_minsec = now;

        if (value->count_rxrpc_key < DIRTYFRAG_RXRPC_THRESHOLD ||
            value->count_rxrpc_minsec < DIRTYFRAG_RXRPC_THRESHOLD) {
            goto aleader;
        }
    }

    logger_printk("DirtyFrag exploit (pid=%d, vector=%llu)", pid, vector);

    value->exploited = 1;
    value->vector    = vector;

    // ...
}

Note: Variable types and threshold constants are inferable from the logic. Calibrating them to the target environment is left to the reader.
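As a starting point, here is one way the types and constants could look, inferred from how the excerpt uses them. Every numeric value is an assumption to be calibrated, the vector tags are arbitrary, and dirtyfrag_rxrpc_tripped is a hypothetical helper that restates the "both counters hot" rule.

```c
#include <stdint.h>

typedef uint64_t u64; /* kernel-style alias, as used in the excerpts */

/* Assumed vector tags; any distinct non-zero values would do. */
#define DIRTYFRAG_VECTOR_ESP   1ULL
#define DIRTYFRAG_VECTOR_RXRPC 2ULL

/* Assumed starting points, to be calibrated per environment. */
#define DIRTYFRAG_ESP_WINDOW_NS   (2ULL * 1000000000ULL)
#define DIRTYFRAG_RXRPC_WINDOW_NS (2ULL * 1000000000ULL)
#define DIRTYFRAG_ESP_THRESHOLD   8
#define DIRTYFRAG_RXRPC_THRESHOLD 4

typedef struct {
    u64 t_hash; /* hash identifying the task (or its leader) */
} dirtyfrag_key_t;

typedef struct {
    u64 last_udp_encap;     /* ns timestamp of last UDP_ENCAP call */
    u64 last_rxrpc_key;     /* ns timestamp of last RXRPC_SECURITY_KEY */
    u64 last_rxrpc_minsec;  /* ns timestamp of last RXRPC_MIN_SECURITY_LEVEL */
    u64 count_udp_encap;    /* calls in the current ESP burst */
    u64 count_rxrpc_key;    /* calls in the current key burst */
    u64 count_rxrpc_minsec; /* calls in the current min-level burst */
    u64 exploited;          /* non-zero once an alert has fired */
    u64 vector;             /* which variant tripped the alert */
} dirtyfrag_val_t;

/* The RxRPC vector fires only when both option counters are hot. */
static int dirtyfrag_rxrpc_tripped(const dirtyfrag_val_t *v)
{
    return v->count_rxrpc_key >= DIRTYFRAG_RXRPC_THRESHOLD &&
           v->count_rxrpc_minsec >= DIRTYFRAG_RXRPC_THRESHOLD;
}
```

Using u64 for every field keeps the value struct free of padding, which avoids surprises when the same layout is read from userspace.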

Conclusion

Both CopyFail (CVE-2026-31431) and DirtyFrag (CVE-2026-43284, CVE-2026-43500) are recent Linux local privilege escalation vulnerabilities that abuse the kernel's page cache through splice() and specific socket subsystems to land controlled writes into cached file data.

For CopyFail, the detection signal is null or repeated ALG_SET_KEY calls paired with ALG_SET_AEAD_ASSOCLEN set to 4. For DirtyFrag, it is burst patterns of UDP_ENCAP configuration, or rapid sequences of RXRPC_SECURITY_KEY and RXRPC_MIN_SECURITY_LEVEL setsockopt calls within the same task.

The eBPF LSM hook at security_socket_setsockopt is the right instrument for both: it sits at the same decision point that AppArmor and SELinux have used for years, fires regardless of whether a traditional MAC policy is active, and can be extended to block rather than only observe by returning an error code from the LSM attach point.

The per-task state tracked in BPF maps is minimal and the overhead is proportional to the frequency of the relevant socket options, which in normal workloads is very low.

DISCLAIMER:

The eBPF code in this material is intentionally incomplete and serves as a starting point for detecting (and blocking) local privilege escalation bugs using LSM hooks. Production-grade detections should complement these probes with hooks on splice and copy_file_range, correlating page-cache write attempts against the socket configuration state already tracked in the maps - covering variants that pre-configure sockets slowly and trigger the write separately.

Evasion is possible but raises the cost for the attacker. Distributing socket setup across threads or forked children defeats per-task counters. Keying the map on the thread group leader (tgid) covers threads; forked children get their own tgid, so catching them needs a coarser key such as cgroup membership. Keeping call cadence just outside the time window exploits the window reset; a leaky-bucket accumulator is more resistant. Both evasion techniques make the exploit less deterministic and introduce patterns detectable by other means.
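A leaky-bucket accumulator could look like the following userspace model. The names are hypothetical, and the drain interval and limit are assumed values that would need tuning. Each call adds one unit, and the bucket drains one unit per interval, so calls paced just outside a short window still accumulate.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEAK_INTERVAL_NS 10000000000ULL /* assumed: drain one unit per 10 s */
#define BUCKET_LIMIT     8              /* assumed: level that raises an alert */

struct leaky_bucket {
    uint64_t last_ns; /* time up to which draining has been applied */
    uint32_t level;   /* current fill level */
};

/* Record one call at `now_ns`; return true when the bucket is full. */
static bool bucket_observe(struct leaky_bucket *b, uint64_t now_ns)
{
    if (b->last_ns == 0) {
        b->last_ns = now_ns; /* first observation */
    } else {
        uint64_t leaked = (now_ns - b->last_ns) / LEAK_INTERVAL_NS;

        b->level = leaked >= b->level ? 0 : b->level - (uint32_t)leaked;
        /* Advance by whole intervals only, so partial intervals carry over. */
        b->last_ns += leaked * LEAK_INTERVAL_NS;
    }
    b->level++;
    return b->level >= BUCKET_LIMIT;
}
```

With these values, a caller pacing itself at one call every 2.5 seconds (outside a 2-second window) still fills the bucket after a handful of calls, while a caller making one call per minute stays at level 1.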

Here is a short demo showing both detections in action.

‍
