# Patch Analysis: CVE-2026-32316

## Fix Commit
- **Commit**: `e47e56d226519635768e6aab2f38f0ab037c09e5`
- **Title**: Fix heap buffer overflow in `jvp_string_append` and `jvp_string_copy_replace_bad`
- **File changed**: `src/jv.c` (+10 lines, -1 line)

## What the Fix Changes

The fix adds two explicit length checks, each evaluated in 64-bit arithmetic so that the check itself cannot wrap:

1. **`jvp_string_append`** (line ~1179):
   ```c
   if ((uint64_t)currlen + len >= INT_MAX) {
     jv_free(string);
     return jv_invalid_with_msg(jv_string("String too long"));
   }
   ```
   Before allocating or copying, the function now rejects any append operation where the combined length of the existing string (`currlen`) and the incoming data (`len`) would reach or exceed `INT_MAX` (2,147,483,647 bytes).

2. **`jvp_string_copy_replace_bad`** (line ~1118):
   ```c
   uint64_t maxlength = (uint64_t)length * 3 + 1;
   if (maxlength >= INT_MAX) {
     return jv_invalid_with_msg(jv_string("String too long"));
   }
   ```
   When copying a UTF-8 string while replacing invalid bytes with `U+FFFD`, each invalid input byte can expand to the three-byte UTF-8 encoding of `U+FFFD`, so the worst-case output size is `length * 3 + 1`. The fix performs the multiplication in `uint64_t` and rejects the operation if the result would be ≥ `INT_MAX`.
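The arithmetic behind both checks can be illustrated in isolation. The sketch below is not jq code: the function names `alloc_size_unchecked`, `append_too_long`, `replace_bad_too_long`, and `check_examples` are hypothetical, and it assumes a platform where `int` and `uint32_t` are both 32 bits wide. It contrasts the 32-bit computation that can wrap with the widened comparisons the patch uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Pre-fix style: the sum and the doubling both happen in 32 bits and can wrap. */
uint32_t alloc_size_unchecked(uint32_t currlen, uint32_t len) {
  return (currlen + len) * 2;
}

/* Post-fix style guard for append: widen to 64 bits before adding. */
int append_too_long(uint32_t currlen, uint32_t len) {
  return (uint64_t)currlen + len >= INT_MAX;
}

/* Post-fix style guard for the worst-case replacement size, length * 3 + 1. */
int replace_bad_too_long(uint32_t length) {
  uint64_t maxlength = (uint64_t)length * 3 + 1;
  return maxlength >= INT_MAX;
}

/* Worked values; call from a test harness to confirm the arithmetic. */
void check_examples(void) {
  /* 0x7FFFFFFF + 0x11 = 0x80000010; doubled, it wraps to 0x20. */
  assert(alloc_size_unchecked(0x7FFFFFFFu, 0x11u) == 32u);
  assert(append_too_long(0x7FFFFFFFu, 0x11u));
  assert(!append_too_long(1000u, 24u));
  /* 0x55555556 * 3 + 1 would wrap to 3 in 32 bits; widened, it is rejected. */
  assert(replace_bad_too_long(0x55555556u));
  assert(!replace_bad_too_long(4096u));
}
```

In the first worked value, roughly 2 GiB of string data would be paired with a 32-byte allocation size if the unchecked result were used, which is the mismatch the guard rules out.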

## Fix Assumptions

The fix assumes that:
- **All string growth in jq goes through `jvp_string_append` or `jvp_string_copy_replace_bad`**. These are the only two functions in `src/jv.c` that perform dynamic string buffer allocation with arithmetic on `uint32_t` lengths.
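- **Callers of `jvp_string_append` pass `len` as an accurate `uint32_t` byte count** (not a value that already wrapped in the caller). The check `(uint64_t)currlen + len >= INT_MAX` is then sufficient to prevent `uint32_t` overflow in the subsequent `(currlen + len) * 2` allocation-size computation, because `INT_MAX` equals `UINT32_MAX / 2` under integer division: any sum that passes the check is at most `INT_MAX - 1`, and doubling it stays below `UINT32_MAX`.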
- **Strings that exceed `INT_MAX` bytes should be treated as an error** rather than supported.
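
The sufficiency argument can be written out with concrete bounds, assuming `int` and `uint32_t` are both 32 bits wide, so that `INT_MAX` is $2^{31}-1$ and `UINT32_MAX` is $2^{32}-1$. For `jvp_string_append`, any pair of lengths that passes the check satisfies:

$$
\text{currlen} + \text{len} \le \mathrm{INT\_MAX} - 1 = 2^{31} - 2
\quad\Longrightarrow\quad
2\,(\text{currlen} + \text{len}) \le 2^{32} - 4 < 2^{32} - 1 = \mathrm{UINT32\_MAX}
$$

For `jvp_string_copy_replace_bad`, any `length` that passes the check satisfies:

$$
3\,\text{length} + 1 \le 2^{31} - 2
\quad\Longrightarrow\quad
\text{length} \le \left\lfloor \frac{2^{31} - 3}{3} \right\rfloor = 715{,}827{,}881
$$

Under these assumptions, neither the doubled allocation size nor the worst-case replacement size can exceed what a `uint32_t` holds once the check has passed.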

## What the Fix Covers

A complete code-path audit shows that every jq string-concatenation or string-growth operation ultimately routes through one of the two patched functions:

- `jv_string_concat` → `jvp_string_append`
- `jv_string_append_buf` → `jvp_string_append`
- `jv_string_append_str` → `jv_string_append_buf` → `jvp_string_append`
- `jv_string_append_codepoint` → `jvp_string_append`
- `jv_string_repeat` → `jvp_string_append` (after its own `INT_MAX` check)
- `jv_dump_string` / `jv_dump_term` → `put_buf` → `jv_string_append_buf` → `jvp_string_append`
- `escape_string` (formatting builtins) → `jv_string_append_str/buf` → `jvp_string_append`
- `f_format` (`@csv`, `@tsv`, `@uri`, `@base64`, etc.) → `jv_string_concat` / `jv_string_append_str`
- `jv_string_implode` → `jv_string_append_codepoint` → `jvp_string_append`
- `sub`/`gsub` (jq-level) → `reduce` with `+` → `binop_plus` → `jv_string_concat`
- `add` over arrays of strings → iterative `+` → `binop_plus`
- `join` (jq-level) → `reduce` with `+` → `binop_plus`
- string interpolation → compiles to `+` → `binop_plus`

Therefore, the single gate in `jvp_string_append` defends all of these entry points.

## What the Fix Does NOT Cover (and why it does not matter for this CVE)

1. **Other `uint32_t` arithmetic in `src/jv.c`**: No other string-length multiplication or addition on `uint32_t` exists in `src/jv.c` besides the two that were patched.
2. **Array/object allocation overflows**: `jvp_array_alloc` and `jvp_object_new` compute sizes using `unsigned`/`int` values multiplied by `sizeof(...)`. On 64-bit systems the arithmetic is promoted to `size_t` and does not wrap in the same way. These are separate bug classes.
3. **Parser token-buffer growth (`jv_parse.c`)**: `p->tokenlen = p->tokenlen*2 + 256` uses `int`. Deeply-nested JSON could theoretically overflow, but this is bounded by `MAX_PARSING_DEPTH` and is unrelated to string concatenation.
4. **`jvp_string_new` / `jvp_string_empty_new`**: These allocate a buffer of a given `uint32_t` size without an `INT_MAX` check. However, every caller either:
   - already validated the length (e.g., `jv_string_repeat` checks `res_len < INT_MAX`), or
   - derives the length from an existing string whose size was bounded by the same allocator rules.
   A negative `int` cast to `uint32_t` could reach these with a huge length, but no jq filter or parser path is known to produce such a value.
5. **Format functions with `uint32_t` index overflow** (e.g., `@uri` using `uint32_t ri` for output length): This can cause result truncation when the encoded length exceeds `UINT32_MAX`, but it does **not** produce a heap-buffer-overflow because the underlying buffer is allocated with a safe `size_t` computation.

## Comparison: Behavior Before and After

| Scenario | Vulnerable (`jq-1.8.1`) | Fixed (`e47e56d`) |
|---|---|---|
| `$a + $b` where `len(a) + len(b) ≥ INT_MAX` | `uint32_t` overflow → tiny allocation → `memcpy` heap-buffer-overflow | Returns jq error: `String too long` |
| `add` over an array of large strings | Same overflow via iterative `jv_string_concat` | Caught at the first append that would reach or exceed `INT_MAX` |
| `reduce` building string step-by-step | Same overflow when total crosses `INT_MAX` | Caught at the step that crosses `INT_MAX` |
| Invalid-UTF8 string copy with `length*3+1 ≥ INT_MAX` | `uint32_t` overflow → tiny allocation → large write | Returns jq error: `String too long` |

## Verdict

The fix is **precise and complete** for the specific integer-overflow-to-heap-buffer-overflow bug chain described by CVE-2026-32316. There is no known bypass: `jvp_string_append` is the chokepoint for all string growth, `jvp_string_copy_replace_bad` carries its own check, and the `INT_MAX` bound is sufficient to prevent the `uint32_t` wrap-around that caused the original crash.

No additional code changes are required to close this specific gap, but a defense-in-depth recommendation would be to add a similar `INT_MAX` check to `jvp_string_new` and `jvp_string_empty_new` so that **all** string constructors enforce the same length bound, not just the two growth functions.
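
The defense-in-depth idea can be sketched as a constructor that refuses oversized lengths before allocating. Everything here is hypothetical: `demo_string`, `demo_string_new_checked`, and `demo_string_length` are illustrative stand-ins, and jq's real `jvp_string` layout and constructor signatures differ. The sketch only shows where such a bound would sit.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical string header; not jq's actual jvp_string structure. */
typedef struct {
  uint32_t alloc_length;
  char data[];              /* C99 flexible array member */
} demo_string;

/* Takes the requested length as uint64_t so that a caller's oversized or
 * sign-converted value is seen in full before any narrowing happens.
 * Returns NULL if the length is at or above INT_MAX or if malloc fails. */
demo_string *demo_string_new_checked(uint64_t length) {
  if (length >= INT_MAX)
    return NULL;
  demo_string *s = malloc(sizeof(*s) + (size_t)length + 1);
  if (s == NULL)
    return NULL;
  s->alloc_length = (uint32_t)length;
  s->data[length] = '\0';   /* keep the buffer NUL-terminated */
  return s;
}

/* Accessor for the recorded capacity. */
uint32_t demo_string_length(const demo_string *s) {
  assert(s != NULL);
  return s->alloc_length;
}
```

Placing the bound in the constructor means a length that bypasses the growth functions would still be rejected, at the cost of one extra comparison per allocation.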
