# CVE-2026-43456 — Patch / Fix Analysis (Variant Stage)

## 1. The fix under analysis

Upstream commit **`950803f7254721c1c15858fbbfae3deaaeeecb11`**
(`bonding: fix type confusion in bond_setup_by_slave()`, author Jiayuan Chen,
linked by Paolo Abeni). Commits referenced by the CVE (the mainline fix and its stable backports):

- `6ac890f1d60ac3707ee8dae15a67d9a833e49956`
- `950803f7254721c1c15858fbbfae3deaaeeecb11`  (mainline fix analyzed here)
- `95597d11dc8bddb2b9a051c9232000bfbb5e43ba`
- `9baf26a91565b7bb2b1d9f99aaf884a2b28c2f6d`

Fixes: `1284cd3a2b74` ("bonding: two small fixes for IPoIB support") — the
commit that introduced `bond_setup_by_slave()` copying the slave's
`header_ops` verbatim onto the bond.

## 2. What the fix changes (files / functions / logic)

Single file: `drivers/net/bonding/bond_main.c`. Two new wrapper callbacks (plus the `bond_header_ops` table that holds them) and a one-line edit.

### 2.1 New wrapper callbacks

```c
static int bond_header_create(struct sk_buff *skb, struct net_device *bond_dev,
                              unsigned short type, const void *daddr,
                              const void *saddr, unsigned int len)
{
        struct bonding *bond = netdev_priv(bond_dev);
        const struct header_ops *slave_ops;
        struct slave *slave;
        int ret = 0;

        rcu_read_lock();
        slave = rcu_dereference(bond->curr_active_slave);
        if (slave) {
                slave_ops = READ_ONCE(slave->dev->header_ops);
                if (slave_ops && slave_ops->create)
                        ret = slave_ops->create(skb, slave->dev,
                                                type, daddr, saddr, len);
        }
        rcu_read_unlock();
        return ret;
}

static int bond_header_parse(const struct sk_buff *skb, unsigned char *haddr)
{
        struct bonding *bond = netdev_priv(skb->dev);
        const struct header_ops *slave_ops;
        struct slave *slave;
        int ret = 0;

        rcu_read_lock();
        slave = rcu_dereference(bond->curr_active_slave);
        if (slave) {
                slave_ops = READ_ONCE(slave->dev->header_ops);
                if (slave_ops && slave_ops->parse)
                        ret = slave_ops->parse(skb, haddr);
        }
        rcu_read_unlock();
        return ret;
}

static const struct header_ops bond_header_ops = {
        .create = bond_header_create,
        .parse  = bond_header_parse,
};
```

Key mechanism: `bond_header_create()` passes the **active slave's own device**
(`slave->dev`) to the slave's callback, so when that callback calls
`netdev_priv(dev)` it receives the slave's correct private struct (e.g.
`struct ip_tunnel` for GRE, `struct ip6_tnl` for ip6gre), not `struct bonding`.
`bond_header_parse()` forwards the skb unchanged, since `.parse` takes no
device argument.
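
The difference between copying the slave's ops and delegating to the slave can be sketched in userspace. This is a minimal model, not kernel code: every `model_*` name is hypothetical, the struct layouts are invented, and the `priv_kind` tag exists only so the model can detect the confusion without actually misreading memory (the real callbacks have no such check).

```c
#include <assert.h>
#include <stddef.h>

enum model_kind { MODEL_TUNNEL, MODEL_BOND };

struct model_dev;

struct model_header_ops {
    int (*create)(struct model_dev *dev);
};

struct model_dev {
    const struct model_header_ops *header_ops;
    enum model_kind priv_kind;   /* model-only tag, absent in the kernel */
    void *priv;                  /* what netdev_priv(dev) would return */
};

struct model_tunnel  { int hlen; };                 /* stand-in for struct ip_tunnel */
struct model_bonding { struct model_dev *active; }; /* stand-in for struct bonding */

/* Stand-in for a tunnel .create callback: it assumes dev->priv is a tunnel. */
static int model_tunnel_create(struct model_dev *dev)
{
    if (dev->priv_kind != MODEL_TUNNEL)
        return -1;               /* the real callback would misread memory here */
    return ((const struct model_tunnel *)dev->priv)->hlen;
}

static const struct model_header_ops model_tunnel_ops = {
    .create = model_tunnel_create,
};

/* Post-fix shape: call the active slave's callback with the slave's device. */
static int model_bond_create(struct model_dev *bond_dev)
{
    const struct model_bonding *bond = bond_dev->priv;
    struct model_dev *slave = bond->active;
    int ret = 0;

    assert(bond_dev->priv_kind == MODEL_BOND);
    if (slave && slave->header_ops && slave->header_ops->create)
        ret = slave->header_ops->create(slave);
    return ret;
}

static const struct model_header_ops model_bond_ops = {
    .create = model_bond_create,
};

/* Usage: the same bond device before and after swapping in the wrapper. */
static void model_demo(void)
{
    struct model_tunnel tun = { .hlen = 4 };
    struct model_dev gre = { &model_tunnel_ops, MODEL_TUNNEL, &tun };
    struct model_bonding b = { .active = &gre };
    struct model_dev bond = { &model_tunnel_ops, MODEL_BOND, &b }; /* pre-fix: ops copied */

    assert(bond.header_ops->create(&bond) == -1); /* callback was handed the bond */
    bond.header_ops = &model_bond_ops;            /* post-fix: wrapper installed */
    assert(bond.header_ops->create(&bond) == 4);  /* callback was handed the slave */
}
```

Calling `model_demo()` from a test harness exercises both cases. With no active slave the wrapper falls through and returns 0, mirroring the `if (slave)` guard in the real wrappers.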

### 2.2 The one-line edit in `bond_setup_by_slave()`

Before:
```c
bond_dev->header_ops = slave_dev->header_ops;
```
After:
```c
bond_dev->header_ops = slave_dev->header_ops ?
                       &bond_header_ops : NULL;
```

So whenever a non-Ethernet slave re-types the bond, the bond no longer inherits
the slave's `header_ops` pointer; it gets the static `bond_header_ops`
wrapper (or NULL if the slave has none).

### 2.3 Calling context of `bond_setup_by_slave()`

`bond_enslave()` calls `bond_setup_by_slave(bond_dev, slave_dev)` **only** when
all of the following hold (bond_main.c ~line 1955-1972):

1. `!bond_has_slaves(bond)`: this is the **first** slave;
2. `bond_dev->type != slave_dev->type`: the slave's type differs from the
   bond's current type (`ARPHRD_ETHER` by default);
3. `slave_dev->type != ARPHRD_ETHER`: the slave is non-Ethernet.

Just above this check, `BOND_MODE_8023AD` is rejected for non-Ethernet slaves.

For an Ethernet first slave, `bond_ether_setup()` runs instead (sets
`eth_header_ops`, which is safe — `eth_header` uses `dev->dev_addr`, not a
tunnel-private struct). A second slave of a differing type is rejected with
`-EINVAL`. Therefore **`bond_setup_by_slave()` is the only path that ever
assigns a non-Ethernet slave's `header_ops` lineage to the bond**, and the fix
is applied exactly there. `grep` across `drivers/net/bonding/` confirms
`bond_setup_by_slave()` is the only writer of `bond_dev->header_ops` for the
non-Ethernet case.
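
The decision described in this section can be written as a small pure function. This is a sketch of the logic as summarized here, not a copy of `bond_enslave()`: the `model_*` names and enum values are hypothetical, the type constants are arbitrary rather than the kernel's `ARPHRD_*` values, and the 802.3ad rejection is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Device types for the model; values are arbitrary, not ARPHRD_* constants. */
enum model_type { MODEL_TYPE_ETHER, MODEL_TYPE_GRE, MODEL_TYPE_IP6GRE };

enum model_action {
    MODEL_KEEP,           /* types already match, bond is left as is */
    MODEL_SETUP_BY_SLAVE, /* bond_setup_by_slave() path */
    MODEL_SETUP_ETHER,    /* bond_ether_setup() path */
    MODEL_REJECT          /* -EINVAL: later slave of a different type */
};

static enum model_action model_enslave(bool bond_has_slaves,
                                       enum model_type bond_type,
                                       enum model_type slave_type)
{
    if (bond_type == slave_type)
        return MODEL_KEEP;
    if (bond_has_slaves)
        return MODEL_REJECT;
    return slave_type != MODEL_TYPE_ETHER ? MODEL_SETUP_BY_SLAVE
                                          : MODEL_SETUP_ETHER;
}

/* Usage: the cases discussed in this section. */
static void model_enslave_demo(void)
{
    /* First slave is a GRE tunnel on a default (Ethernet-typed) bond. */
    assert(model_enslave(false, MODEL_TYPE_ETHER, MODEL_TYPE_GRE)
           == MODEL_SETUP_BY_SLAVE);
    /* First slave is Ethernet on a bond that was previously re-typed. */
    assert(model_enslave(false, MODEL_TYPE_GRE, MODEL_TYPE_ETHER)
           == MODEL_SETUP_ETHER);
    /* A later slave of a differing type is refused. */
    assert(model_enslave(true, MODEL_TYPE_GRE, MODEL_TYPE_IP6GRE)
           == MODEL_REJECT);
}
```

Only the `MODEL_SETUP_BY_SLAVE` outcome corresponds to the path the fix touches, which is why a single edit there is enough.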

## 3. What the `struct header_ops` looks like and what the fix wraps

`include/linux/netdevice.h`:

```c
struct header_ops {
        int     (*create)         (struct sk_buff *, struct net_device *, unsigned short, const void *, const void *, unsigned int);
        int     (*parse)          (const struct sk_buff *, unsigned char *);
        int     (*cache)          (const struct neighbour *, struct hh_cache *, __be16);
        void    (*cache_update)   (struct hh_cache *, const struct net_device *, const unsigned char *);
        bool    (*validate)       (const char *, unsigned int);
        __be16  (*parse_protocol) (const struct sk_buff *);
};
```

Six callbacks. **The fix's `bond_header_ops` only wraps `.create` and
`.parse`.** The other four (`.cache`, `.cache_update`, `.validate`,
`.parse_protocol`) are left NULL on the bond.

This is **safe**, not a gap: because the fix *replaces* `bond_dev->header_ops`
entirely with `&bond_header_ops` (rather than merging fields), the slave's
`.cache`/`.cache_update`/`.validate`/`.parse_protocol` are simply **not
present** on the bond and are never invoked through it. All in-kernel callers
of these callbacks are NULL-guarded:

- `dev_hard_header()` → `ops->create` (wrapped);
- `dev_parse_header()` → `ops->parse` (wrapped);
- `dev_parse_header_protocol()` → `if (!ops || !ops->parse_protocol) return 0;`
  (NULL-safe);
- `neigh_hh_init()` / `neigh_fill_hh_cache()` → `ops->cache` only if non-NULL
  (NULL → no L2 header caching, slow path, no type confusion);
- `dev_validate_header()` → `ops->validate` only if non-NULL.

So the bond loses `.cache`/`.cache_update`/`.validate`/`.parse_protocol`
functionality (a minor performance/functional regression for non-Ethernet
bonds), but **gains safety**: none of the slave's private-struct-dereferencing
callbacks can run against the bond's `netdev_priv()`.
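
The effect of replacing the whole ops table can be illustrated with a reduced userspace model of one NULL-guarded caller. It is a sketch under assumptions: the two-slot `struct model_ops` and all `model_*` names are hypothetical, the callback is only a stand-in for a `parse_protocol` implementation that inspects packet bytes, and byte order is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-slot reduction of struct header_ops. */
struct model_ops {
    int (*create)(void);
    unsigned short (*parse_protocol)(const unsigned char *l3);
};

static int model_create(void)
{
    return 0;
}

/* Stand-in parse_protocol: looks only at the IP version nibble of the
 * packet, never at device-private state. */
static unsigned short model_parse_protocol(const unsigned char *l3)
{
    switch (l3[0] >> 4) {
    case 4:  return 0x0800;  /* ETH_P_IP */
    case 6:  return 0x86DD;  /* ETH_P_IPV6 */
    default: return 0;
    }
}

/* Slave table sets parse_protocol; the bond's replacement table does not. */
static const struct model_ops model_slave_ops = {
    .parse_protocol = model_parse_protocol,
};
static const struct model_ops model_bond_ops = {
    .create = model_create,
};

/* Same guard shape as the dev_parse_header_protocol() line quoted above. */
static unsigned short model_dev_parse_protocol(const struct model_ops *ops,
                                               const unsigned char *l3)
{
    if (!ops || !ops->parse_protocol)
        return 0;
    return ops->parse_protocol(l3);
}

/* Usage: the slave's callback is reachable via the slave, not via the bond. */
static void model_guard_demo(void)
{
    const unsigned char v4[] = { 0x45 };

    assert(model_dev_parse_protocol(&model_slave_ops, v4) == 0x0800);
    assert(model_dev_parse_protocol(&model_bond_ops, v4) == 0);
    assert(model_dev_parse_protocol(NULL, v4) == 0);
}
```

Because the bond's table never holds the slave's pointer, the guard returns before any slave code runs, which is the property this section relies on.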

## 4. Assumptions the fix makes

1. **The dangerous surface is `header_ops->create` (and `->parse`).** The fix
   assumes the type confusion is exploited through `dev_hard_header()` →
   `create` (and `dev_parse_header()` → `parse`). This matches the disclosed
   crash trace (`packet_sendmsg → packet_snd → dev_hard_header → ipgre_header
   → pskb_expand_head → BUG_ON`).
2. **The active slave is the right device to build the header against.** The
   wrapper uses `bond->curr_active_slave`. In active-backup mode (the mode the
   CVE and repro use) this is the single slave currently selected to carry
   traffic. If `curr_active_slave` is NULL (e.g. some non-active-backup modes
   with no selected slave), the wrapper returns 0: no header is built and there
   is **no crash** (a functional no-op, not a security issue).
3. **`bond_setup_by_slave()` is the only place a non-Ethernet slave's
   `header_ops` lineage reaches the bond.** Verified by grep (see §2.3).
4. **It is sufficient to *replace* rather than *fully wrap* `header_ops`.** By
   substituting `&bond_header_ops` (create+parse only) for the slave's ops, the
   slave's other callbacks become unreachable on the bond. This is the fix's
   implicit decision and it is correct.

## 5. Code paths / inputs the fix does NOT cover — and why each is safe

| Candidate alternate path | Reaches netdev_priv type confusion? | Covered by fix? |
|---|---|---|
| **ip6gre** slave (IPv6 GRE, no remote) → `ip6gre_header()` reads `struct ip6_tnl` | **YES** (distinct sink: `net/ipv6/ip6_gre.c`, `ip6_tnl.hlen` at offset 264) | **YES** — `bond_header_create` delegates `ip6gre_header` to the slave `ip6gre1` device (validated in this variant run: `dev=ip6gre1 hlen=4`, no crash) |
| **ipip / sit / ip_vti / ip6_vti / ip6_tunnel / xfrm** slaves → `ip_tunnel_header_ops` | **NO** — `ip_tunnel_header_ops = { .parse_protocol = ip_tunnel_parse_protocol }` has **no `.create`**; `ip_tunnel_parse_protocol()` does **not** dereference `netdev_priv()` (it only inspects skb network-header data) | N/A (no type confusion to begin with); fix additionally NULLs `parse_protocol` on the bond |
| `.cache` / `.cache_update` of a non-Ethernet slave | **NO**: `ip6gre_header_ops` defines only `.create` and `ipgre_header_ops` only `.create` + `.parse`; neither has `.cache`/`.cache_update` | N/A (no such callbacks on the GRE sinks); fix NULLs them on the bond regardless |
| `.validate` / `.parse_protocol` of a non-Ethernet slave | **NO**: none of the GRE tunnel `header_ops` set these; `ip_tunnel_parse_protocol` doesn't touch `netdev_priv` | N/A; fix NULLs them on the bond |
| A second non-Ethernet slave of a different type | **NO**: rejected with `-EINVAL` at enslavement (type mismatch) | N/A |
| `curr_active_slave == NULL` when `dev_hard_header()` runs | Vulnerable kernel: **YES**, the bond's copied `header_ops->create` still runs with `dev=bond`. Fixed kernel: **NO**, the wrapper returns 0 and no header is built | **YES**: safe no-op on the fixed kernel (no crash) |
| Ethernet first slave | **NO**: goes through `bond_ether_setup()` → `eth_header_ops` (safe) | N/A |

## 6. Behavior before vs after the fix (validated)

Same kernel image (Linux 7.0.0-rc2, commit `e3f5e0f22`, KASAN SMP, with
`pr_info` injected into `ipgre_header` and `ip6gre_header`); only `bonding.ko`
differs (vulnerable = pre-fix, fixed = post-fix).

### 6.1 Original ipgre path (from the repro stage)

| | dev passed to header cb | confused `hlen` | result |
|---|---|---|---|
| VULN | `bond1` (`netdev_priv`=struct bonding read as `struct ip_tunnel`, hlen offset 160) | populated to `0x961a63cc` | `needed` overflows negative → `pskb_expand_head` `BUG_ON(nhead<0)` → **panic** |
| FIXED | `gre1` (correct slave, `netdev_priv`=struct ip_tunnel, hlen=4) | write harmless | no crash, `RESULT: NOT VULNERABLE` |

### 6.2 ip6gre variant path (this stage — see `reproduction_steps.sh`)

| | dev passed to header cb | confused `hlen` | result |
|---|---|---|---|
| VULN | `bond1` (`netdev_priv`=struct bonding read as `struct ip6_tnl`, hlen offset 264) | populated to `0x961a63cc` | `needed = hlen + 40` overflows negative → `pskb_expand_head` `BUG_ON(nhead<0)` → **panic** (`Call Trace: ip6gre_header+0x14a/0x430 [ip6_gre]`) |
| FIXED | `ip6gre1` (correct slave, `netdev_priv`=struct ip6_tnl, hlen=4) | write harmless | no crash, `RESULT: NOT VULNERABLE` |

The fixed kernel's `bond_header_ops` delegates `ip6gre_header` to the slave
`ip6gre1` device, so `netdev_priv()` correctly returns the ip6gre tunnel's
`struct ip6_tnl`. The variant does **not** reproduce on the fixed kernel.
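
The VULN rows above come down to one piece of arithmetic, which can be checked in userspace. This sketch assumes that `hlen` is read as a signed 32-bit integer and takes the constant 40 from the `needed = hlen + 40` expression in the ip6gre row; the `model_*` helpers are hypothetical and do not model `pskb_expand_head()` itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern the way a signed int field would be read.
 * memcpy sidesteps the implementation-defined unsigned-to-signed cast. */
static int32_t model_as_signed(uint32_t bits)
{
    int32_t v;

    memcpy(&v, &bits, sizeof(v));
    return v;
}

/* needed = hlen + fixed IP header size, widened so the sum cannot wrap. */
static int64_t model_needed(uint32_t hlen_bits, int ip_hdr_len)
{
    return (int64_t)model_as_signed(hlen_bits) + ip_hdr_len;
}

/* Usage: the confused value from the tables versus a genuine hlen of 4. */
static void model_needed_demo(void)
{
    /* 0x961a63cc has its top bit set, so it reads as a negative int. */
    assert(model_as_signed(0x961a63ccu) == -1776655412);
    assert(model_needed(0x961a63ccu, 40) < 0);
    /* The genuine tunnel value stays small and positive. */
    assert(model_needed(4u, 40) == 44);
}
```

Under these assumptions the headroom request is negative as soon as the confused field is read, which is consistent with the `BUG_ON(nhead<0)` outcome recorded in the tables.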

## 7. Why the original reproduction missed the ip6gre sink

`ip6gre_tunnel_init()` assigns `ip6gre_header_ops` **only** when
`ipv6_addr_any(&tunnel->parms.raddr)` is true (NBMA / no-remote mode):

```c
if (ipv6_addr_any(&tunnel->parms.raddr))
        dev->header_ops = &ip6gre_header_ops;
```

The repro stage's ip6gre probe created the tunnel with a **remote** address
(`ip link add ip6gre1 type ip6gre local fd00::1 remote fd00::2`), so `raddr`
was not the any-address, `ip6gre_header_ops` was **never assigned**, and
`ip6gre_header()` was never invoked (the repro log shows zero
`ip6gre_header` printks — only `ipgre_header` on `bond1`). This variant uses
`local fd00::1` with **no remote**, which assigns `ip6gre_header_ops` and
actually reaches the type-confused `ip6gre_header(bond1)` sink.
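
The condition can be modelled in userspace with the POSIX address helpers. This is a sketch, not the kernel check: `model_gets_header_ops()` is a hypothetical name, `IN6_IS_ADDR_UNSPECIFIED` stands in for `ipv6_addr_any()`, and passing `NULL` models a tunnel created without `remote`, on the assumption that `raddr` is then left all-zero.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when a tunnel with this remote would be given
 * ip6gre_header_ops under the condition quoted above. */
static bool model_gets_header_ops(const char *remote)
{
    struct in6_addr raddr = IN6ADDR_ANY_INIT;

    /* An unparsable address is treated as "no ops" in this model. */
    if (remote && inet_pton(AF_INET6, remote, &raddr) != 1)
        return false;
    return IN6_IS_ADDR_UNSPECIFIED(&raddr);
}

/* Usage: the repro stage's tunnel versus this variant's tunnel. */
static void model_raddr_demo(void)
{
    assert(!model_gets_header_ops("fd00::2")); /* remote set: sink never reached */
    assert(model_gets_header_ops(NULL));       /* no remote: sink reachable */
}
```

The two assertions correspond to the two tunnel configurations contrasted in this section.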

## 8. Threat-model scope (target's stated posture)

This is a **local** kernel denial-of-service. Constructing a bond, creating a
GRE/ip6gre tunnel, and enslaving the tunnel to the bond all require
`CAP_NET_ADMIN` (in the initial namespace or in a network namespace where the
attacker holds that capability). The attacker is therefore a privileged local
user, or a user who holds `CAP_NET_ADMIN` only inside a network namespace, who
can crash the kernel. This is consistent with the CVE's **Medium** severity.
There is no `SECURITY.md` in the Linux tree that excludes this class; the
kernel treats a type confusion that a privileged user can drive into an Oops
as a real bug (here, a `BUG_ON` panic on a KASAN-enabled build). The trust
boundary crossed is "privileged user with `CAP_NET_ADMIN` issues network
configuration that the bonding driver mis-handles, crashing the kernel for all
users on the host."

## 9. Verdict on fix completeness

The fix is **complete** for the disclosed class (type confusion via a
non-Ethernet slave's `header_ops->create` dereferencing `netdev_priv()`):

- It is **generic**: `bond_header_create()` delegates *whatever* the active
  slave's `header_ops->create` is (`ipgre_header`, `ip6gre_header`, or any
  future tunnel create) to the slave's own device, so `netdev_priv()` always
  receives the slave's correct private struct. It does not hardcode `ipgre`.
- It **replaces** (not merges) `header_ops`, so the slave's
  `.cache`/`.cache_update`/`.validate`/`.parse_protocol` are unreachable on
  the bond (NULL), eliminating any residual type confusion through those
  callbacks.
- The only tunnel `header_ops` that actually dereference `netdev_priv()` in a
  type-confusing way are `ipgre_header` (struct ip_tunnel) and `ip6gre_header`
  (struct ip6_tnl) — both `.create` sinks, both covered by the wrapper. The
  `ip_tunnel_header_ops` family (ipip/sit/vti/ip6_tunnel/xfrm) has no
  `.create` and its `.parse_protocol` does not touch `netdev_priv`.

**No bypass was found.** The ip6gre trigger is a *distinct, previously
unvalidated* path to the *same* root cause, and the fix already covers it. No
additional code change is required.

### Recommendation (advisory, not required for correctness)

The fix intentionally drops `.cache`/`.cache_update`/`.validate`/
`.parse_protocol` on non-Ethernet bonds. If full functional parity is desired
for non-Ethernet bonds (header caching, validation), those callbacks could be
wrapped analogously (`bond_header_cache` delegating to the active slave's
`.cache` with the slave device). This is a **functional enhancement**, not a
security gap: the GRE tunnel `header_ops` define none of those callbacks (see
§5), and the fix already prevents the create/parse type confusion. It can be
pursued independently of CVE-2026-43456.
