# CVE-2026-43503 (DirtyClone) – Root Cause Analysis

## Summary

CVE-2026-43503, nicknamed **DirtyClone**, is a local privilege escalation flaw in the Linux kernel networking stack.  When a socket buffer (skb) carries file-backed page-cache fragments, the `SKBFL_SHARED_FRAG` flag in `skb_shinfo()->flags` tells the XFRM/IPsec receive path that it must copy the data before decrypting in place.  Several skb fragment-transfer helpers (`__pskb_copy_fclone()`, `skb_shift()`, `skb_gro_receive()`, `skb_gro_receive_list()`, `tcp_clone_payload()`, and `skb_segment()`) fail to copy that flag to the new skb.  A netfilter `TEE` clone therefore keeps the original page-cache reference but no longer reports itself as shared/file-backed, so `esp_input()` decrypts directly into the page-cache page of a root-owned setuid binary.  By choosing the AES-CBC key/IV so that the decrypted bytes are attacker-controlled shellcode, the attacker rewrites the cached copy of a binary such as `/usr/bin/su` in RAM and obtains a root shell when that binary is executed.

## Impact

- **Package/component affected:** Linux kernel networking stack — specifically the skb fragment-copy helpers and the XFRM/IPsec `esp_input()` path.
- **Affected versions:** Mainline Linux before v7.1-rc5; stable branches before 5.10.257, 5.15.208, 6.1.174, 6.6.141, 6.12.91, 6.18.33, and 7.0.10.
- **Risk level and consequences:** Local privilege escalation.  An unprivileged local user with the ability to create user and network namespaces can gain root code execution by corrupting the page cache of a setuid root binary.  The bug is reachable from the normal networking paths used by XFRM/IPsec and netfilter, so no special hardware or third-party modules are required beyond the standard kernel configuration.
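
The version boundaries above can be turned into a quick triage check.  The following is a minimal sketch, not an authoritative scanner: the helper name `classify_kernel` is hypothetical, and it only understands kernel.org-style `x.y.z` numbering.  Distribution kernels (for example `5.15.0-91-generic`) carry their own backports, so their patch level cannot be compared against the stable thresholds and the result should be treated as a hint only.

```python
import re

# Fixed releases per stable branch, copied from the "Affected versions" bullet.
FIXED_STABLE = {
    (5, 10): 257,
    (5, 15): 208,
    (6, 1): 174,
    (6, 6): 141,
    (6, 12): 91,
    (6, 18): 33,
    (7, 0): 10,
}


def classify_kernel(release: str) -> str:
    """Return 'affected', 'fixed', or 'unknown' for a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release.lstrip("v"))
    if not m:
        return "unknown"
    branch = (int(m.group(1)), int(m.group(2)))
    patch = int(m.group(3) or 0)

    if branch in FIXED_STABLE:
        return "fixed" if patch >= FIXED_STABLE[branch] else "affected"
    if branch == (7, 1):
        # Mainline is fixed from v7.1-rc5 onward; earlier release candidates are not.
        rc = re.search(r"rc(\d+)", release)
        return "affected" if rc and int(rc.group(1)) < 5 else "fixed"
    if branch > (7, 1):
        # Assumption: later mainline releases inherit the fix.
        return "fixed"
    # Branches not listed in this report: no statement is made here.
    return "unknown"
```

For the two kernels used in this reproduction, `classify_kernel("7.0.9-070009-generic")` yields `affected` and `classify_kernel("7.0.10-070010-generic")` yields `fixed`.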

## Impact Parity

- **Disclosed/claimed maximum impact:** Local privilege escalation (`privilege_escalation`) — unprivileged attacker writes into a root-owned read-only binary's page cache and gains root.
- **Reproduced impact from this run:** Full privilege escalation demonstrated.  The reproducer runs as uid 1000, modifies `/usr/bin/su` in the page cache, and the subsequent `echo id | /usr/bin/su` as the same uid 1000 user prints `uid=0(root) gid=0(root) groups=0(root)`.
- **Parity:** `full`.
- **Not demonstrated:** N/A — the claimed root shell was obtained.

## Root Cause

The bug is in the skb fragment-copy routines.  `__pskb_copy_fclone()` (and the other helpers listed above) copy the fragment array and page references from one skb to another, but they do not propagate the `SKBFL_SHARED_FRAG` flag from the source `skb_shinfo()->flags`.  This flag is set by the DirtyFrag splice fix (`f4c50a4034e6`) whenever an skb carries file-backed page-cache fragments via `vmsplice()`/`splice()` into a socket.

When the netfilter `TEE` target clones an outbound ESP-in-UDP packet, the clone goes through `nf_dup_ipv4()` → `__pskb_copy_fclone()`.  The clone keeps a reference to the same physical page-cache page but is no longer marked `SKBFL_SHARED_FRAG`.  On the receive side, `esp_input()` relies on `skb_has_shared_frag()` to decide whether the data must be made private through `skb_cow_data()` before decryption.  Because the flag is missing, the copy-on-write path is skipped and the in-place AES-CBC decryption writes attacker-controlled bytes into the file-backed page-cache page.

The upstream fix is commit `48f6a5356a33dd78e7144ae1faef95ffc990aae0` (first tag v7.1-rc5), which propagates the `SKBFL_SHARED_FRAG` flag across the fragment-transfer helpers.  Ubuntu mainline v7.0.10 contains the backport, so it serves as the negative control in this reproduction.
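
To make the flag-propagation logic concrete, here is a toy userspace model of the mechanism described above.  It is a sketch for reasoning only and is not kernel code: the classes `Skb` and `SharedInfo`, the two copy helpers, and the XOR "decryption" are all hypothetical stand-ins, and the bit value chosen for `SKBFL_SHARED_FRAG` is illustrative.

```python
from dataclasses import dataclass, field

SKBFL_SHARED_FRAG = 1 << 1  # illustrative bit value for this model


@dataclass
class SharedInfo:
    flags: int = 0
    frags: list = field(default_factory=list)  # references to "pages" (bytearrays)


@dataclass
class Skb:
    shinfo: SharedInfo = field(default_factory=SharedInfo)


def copy_frags_buggy(src: Skb) -> Skb:
    """Share the page references but drop the flag (models the reported bug)."""
    dst = Skb()
    dst.shinfo.frags = list(src.shinfo.frags)  # same page objects, new list
    return dst


def copy_frags_fixed(src: Skb) -> Skb:
    """Same copy, but carry SKBFL_SHARED_FRAG over to the new skb."""
    dst = copy_frags_buggy(src)
    dst.shinfo.flags |= src.shinfo.flags & SKBFL_SHARED_FRAG
    return dst


def decrypt_in_place(skb: Skb) -> None:
    """Stand-in for the receive path: take a private copy if frags are shared."""
    if skb.shinfo.flags & SKBFL_SHARED_FRAG:
        skb.shinfo.frags = [bytearray(p) for p in skb.shinfo.frags]
    for page in skb.shinfo.frags:
        for i in range(len(page)):
            page[i] ^= 0xFF  # placeholder transform, not real crypto


def demo():
    """Return (page unchanged after fixed copy, page unchanged after buggy copy)."""
    page_cache = bytearray(b"\x7fELF")
    original = bytes(page_cache)
    src = Skb(SharedInfo(flags=SKBFL_SHARED_FRAG, frags=[page_cache]))

    decrypt_in_place(copy_frags_fixed(src))
    intact_after_fixed = bytes(page_cache) == original

    decrypt_in_place(copy_frags_buggy(src))
    intact_after_buggy = bytes(page_cache) == original
    return intact_after_fixed, intact_after_buggy
```

In this model `demo()` returns `(True, False)`: with the flag propagated the receive path works on a private copy, while the flag-dropping copy lets the transform land in the shared page.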

## Reproduction Steps

The reproduction is fully automated by `bundle/repro/reproduction_steps.sh`.  In short, it:

1. Installs QEMU, busybox-static, cpio, gcc, coreutils, and jq if they are not present.
2. Uses the prepared project cache to obtain the Ubuntu mainline v7.0.9 (vulnerable) and v7.0.10 (fixed) kernels plus matching rootfs images.
3. Compiles the rafaeldtinoco `dirtyclone.c` exploit and a small `runas` helper that drops to uid 1000.
4. Builds a custom initramfs that:
   - mounts the rootfs read-only,
   - switches root into the Ubuntu rootfs,
   - loads `xt_TEE`, `nf_dup_ipv4`, `esp4`, and `xfrm_user`,
   - runs the exploit as an unprivileged uid 1000 user,
   - executes `echo id | /usr/bin/su` as the same uid 1000 user.
5. Boots each kernel/rootfs pair in QEMU with the initramfs and captures the console output.
6. Checks that the vulnerable run prints `LPE_SUCCESS` and `uid=0(root)`, while the fixed run prints `LPE_FAIL` and `su: Authentication failure`.
7. Writes `bundle/repro/runtime_manifest.json` and `bundle/repro/validation_verdict.json`.
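
Step 5 can be sketched as follows.  This is not the content of `reproduction_steps.sh`; it is a minimal illustration of a direct-kernel QEMU boot with console capture.  The function names, the memory size, `format=raw`, `if=virtio`, and the `console=ttyS0` command line are assumptions, and the file paths passed in are placeholders.

```python
import subprocess


def qemu_command(kernel, initramfs, rootfs, memory_mb=1024):
    """Build a QEMU argv for a direct kernel boot with a read-only rootfs drive."""
    return [
        "qemu-system-x86_64",
        "-m", str(memory_mb),
        "-kernel", kernel,
        "-initrd", initramfs,
        "-append", "console=ttyS0",  # send kernel and init output to the serial console
        # readonly=on keeps the guest from modifying the host-side image file
        "-drive", f"file={rootfs},format=raw,if=virtio,readonly=on",
        "-nographic",  # serial console on stdio, no display window
        "-no-reboot",  # exit instead of rebooting when the guest resets
    ]


def boot_and_capture(cmd, log_path, timeout_s=300):
    """Run QEMU and write the combined console output to log_path.

    Raises subprocess.TimeoutExpired if the guest does not exit in time.
    """
    with open(log_path, "wb") as log:
        return subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
            check=False,
        ).returncode
```

Booting the vulnerable and fixed pairs through the same function keeps the two runs comparable, since only the kernel and rootfs paths differ.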

### Expected evidence

- `bundle/logs/qemu_vuln.log` must contain:
  - `Linux (none) 7.0.9-070009-generic ...`,
  - `[dc] wrote 192 bytes to /usr/bin/su starting at 0x0`,
  - `uid=0(root) gid=0(root) groups=0(root)` from the `id` command,
  - `LPE_SUCCESS: unprivileged user got root`.
- `bundle/logs/qemu_fixed.log` must contain:
  - `Linux (none) 7.0.10-070010-generic ...`,
  - `[dc] post-write verify failed (target unchanged)`,
  - `Password: su: Authentication failure`,
  - `LPE_FAIL`.
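
The marker lists above lend themselves to a mechanical check.  The sketch below is an assumption about how such a check could look, not the script's actual implementation; the function names are hypothetical.  The extra rule that the fixed log must not contain any success marker is an addition made here for safety and is not stated in the list above.

```python
VULN_MARKERS = [
    "7.0.9-070009-generic",
    "[dc] wrote 192 bytes to /usr/bin/su starting at 0x0",
    "uid=0(root) gid=0(root) groups=0(root)",
    "LPE_SUCCESS: unprivileged user got root",
]

FIXED_MARKERS = [
    "7.0.10-070010-generic",
    "[dc] post-write verify failed (target unchanged)",
    "Password: su: Authentication failure",
    "LPE_FAIL",
]

# Added assumption: these strings must never show up in the fixed run.
FIXED_FORBIDDEN = ["LPE_SUCCESS", "uid=0(root)"]


def missing_markers(log_text, markers):
    """Return the markers that do not occur anywhere in the log text."""
    return [m for m in markers if m not in log_text]


def logs_match_expectation(vuln_log, fixed_log):
    """True only if both logs contain every required marker and the
    fixed log contains none of the forbidden ones."""
    leaked = [m for m in FIXED_FORBIDDEN if m in fixed_log]
    return (
        not missing_markers(vuln_log, VULN_MARKERS)
        and not missing_markers(fixed_log, FIXED_MARKERS)
        and not leaked
    )
```

A typical call reads both files as text, for example `logs_match_expectation(open("bundle/logs/qemu_vuln.log", errors="replace").read(), open("bundle/logs/qemu_fixed.log", errors="replace").read())`.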

## Evidence

- `bundle/logs/reproduction_steps.log` — high-level script progress.
- `bundle/logs/qemu_vuln.log` — full console capture of the vulnerable kernel run showing the root shell.
- `bundle/logs/qemu_fixed.log` — full console capture of the fixed kernel run showing the exploit is blocked.
- `bundle/repro/runtime_manifest.json` — runtime evidence manifest.
- `bundle/repro/validation_verdict.json` — structured verdict (`claim_outcome: confirmed`).

Key excerpts from the vulnerable run:

```
[dc] cmd: iptables -t mangle -A OUTPUT -p udp --dport 4500 -j TEE --gateway 10.99.0.2 -> 0
[dc] installed 48 xfrm SAs
[dc] wrote 192 bytes to /usr/bin/su starting at 0x0
[dc] /usr/bin/su page-cache patched (entry 0x78 = shellcode)
=== LPE check as uid 1000 ===
uid=0(root) gid=0(root) groups=0(root)
LPE_SUCCESS: unprivileged user got root
```

Key excerpts from the fixed run:

```
[dc] wrote 192 bytes to /usr/bin/su starting at 0x0
[dc] post-write verify failed (target unchanged)
=== LPE check as uid 1000 ===
Password: su: Authentication failure
LPE_FAIL
```

The kernel configuration required for the path is present in both Ubuntu mainline builds: `CONFIG_XFRM`, `CONFIG_INET_ESP`, `CONFIG_NETFILTER_XT_TARGET_TEE`, and user namespaces (`CONFIG_USER_NS`).
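
A kernel config file can be checked for these options with a few lines of parsing.  The following is a minimal sketch with hypothetical function names; it assumes `CONFIG_USER_NS` is the option behind "user namespaces" and that the config text comes from a file such as `/boot/config-<release>`, which not every system provides.

```python
REQUIRED_OPTIONS = [
    "CONFIG_XFRM",
    "CONFIG_INET_ESP",
    "CONFIG_NETFILTER_XT_TARGET_TEE",
    "CONFIG_USER_NS",  # assumed mapping for "user namespaces"
]


def parse_kconfig(text):
    """Parse `CONFIG_FOO=value` lines; '# CONFIG_FOO is not set' lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def path_options(text):
    """Map each required option to True if it is built in (y) or modular (m)."""
    options = parse_kconfig(text)
    return {name: options.get(name) in ("y", "m") for name in REQUIRED_OPTIONS}


def path_reachable(text):
    """True if every option the exploit path depends on is enabled."""
    return all(path_options(text).values())
```

A `False` result only means the path used by this reproducer is unavailable on that build; it says nothing about whether the flag-propagation fix is present.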

## Recommendations / Next Steps

- **Upgrade:** Apply the upstream fix `48f6a5356a33` (or the corresponding stable backport).  Ubuntu mainline v7.0.10 and later, and mainline v7.1-rc5 and later, are patched.
- **Mitigation until patched:** The exploit requires `CAP_NET_ADMIN` in a private network namespace, which an unprivileged user obtains through a user namespace.  Disable unprivileged user namespaces (`kernel.unprivileged_userns_clone=0` on Debian/Ubuntu-derived kernels), or block loading of `xt_TEE` / `esp4` / `nf_dup_ipv4` on hosts that do not need them.
- **Testing:** Backport validation should verify that `SKBFL_SHARED_FRAG` survives `__pskb_copy_fclone()`, `skb_shift()`, `skb_gro_receive()`, `skb_gro_receive_list()`, `tcp_clone_payload()`, and `skb_segment()` by running this reproducer against the candidate kernel.
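
Whether the user-namespace mitigation is active can be read back from procfs.  The sketch below is an illustration with a hypothetical function name.  `kernel.unprivileged_userns_clone` is the sysctl named above; `user.max_user_namespaces` is not mentioned in this report and is included as an assumption, since setting it to `0` also prevents namespace creation on kernels that lack the first knob.

```python
from pathlib import Path


def userns_status(proc_sys="/proc/sys"):
    """Report the user-namespace sysctls; a value of None means the knob is absent
    or unreadable on this kernel."""
    root = Path(proc_sys)

    def read_int(relative):
        try:
            return int((root / relative).read_text().strip())
        except (OSError, ValueError):
            return None

    clone = read_int("kernel/unprivileged_userns_clone")
    max_ns = read_int("user/max_user_namespaces")
    return {
        "unprivileged_userns_clone": clone,
        "max_user_namespaces": max_ns,
        # Blocked if either knob is explicitly set to zero.
        "unprivileged_userns_blocked": clone == 0 or max_ns == 0,
    }
```

The `proc_sys` parameter exists so the function can be pointed at a mounted guest filesystem or a test directory instead of the live host.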

## Additional Notes

- **Idempotency:** The script was run twice consecutively from a clean state and produced the same confirmed verdict both times.
- **Isolation:** The rootfs disk is mounted read-only inside the VM and the QEMU drive is opened with `readonly=on`, so the host-side image files are not modified by the exploit.
- **Limitations:** The reproduction depends on the prebuilt Ubuntu mainline kernels and rootfs in the project cache.  The exploit path uses `udp/4500` ESP-in-UDP encapsulation and a netfilter `mangle/OUTPUT` TEE rule to force the `__pskb_copy_fclone()` clone.  On a kernel configuration that lacks any of these components the reproducer fails closed (no escalation is reported), which is expected behavior for an incomplete path.
