# RCA Report: CVE-2026-32316

## Summary

CVE-2026-32316 is an integer overflow vulnerability in jq's string concatenation path (`jvp_string_append` in `src/jv.c`). When two strings whose combined length exceeds `INT_MAX` (2,147,483,647 bytes) are concatenated, the allocation size `(currlen + len) * 2` overflows `uint32_t` and wraps to a tiny value. A subsequent `memcpy` then writes gigabytes of data into the undersized heap buffer, causing a heap-based buffer overflow (CWE-190 → CWE-122).

## Impact

- **Package/component affected**: jq (command-line JSON processor), specifically `jv_string_concat` / `jvp_string_append` in `src/jv.c`
- **Affected versions**: `<= 1.8.1`
- **Risk level**: High (CVSS 3.1: 7.5)
- **Consequences**: Any service or pipeline that runs jq filters on attacker-controlled input can be crashed or have heap memory corrupted when the attacker supplies JSON strings that, when concatenated, exceed `INT_MAX` bytes. This is reachable in common patterns such as `add` over an array of large strings or direct `+` operations.

## Root Cause

In `src/jv.c`, `jvp_string_append` computes the new allocation size using 32-bit unsigned arithmetic:

```c
uint32_t allocsz = (currlen + len) * 2;
```

When `currlen + len` exceeds `INT_MAX` (that is, reaches `2^31`, ≈ 2.1 GB), the product reaches or exceeds `2^32` and wraps around `uint32_t` to a very small number (as small as 0, which is then clamped to 32 bytes). A sum of exactly `INT_MAX` still fits, since `INT_MAX * 2` is below `UINT32_MAX`. The function then allocates this tiny buffer via `jvp_string_alloc(allocsz)` and copies the string data into it with `memcpy`, writing far past the allocated region.

The same overflow pattern existed in `jvp_string_copy_replace_bad`, where `length * 3 + 1` could wrap `uint32_t` for very large inputs.
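To make the wraparound concrete, the following minimal sketch reproduces the two size computations in isolation. The helper names `wrapped_append_size` and `wrapped_replace_size` are hypothetical and are not part of jq; the 32-byte floor mirrors the clamp described above.

```c
#include <stdint.h>

/* Hypothetical helper (not jq code): mirrors `(currlen + len) * 2`
 * evaluated in uint32_t, followed by the 32-byte floor described above. */
uint32_t wrapped_append_size(uint32_t currlen, uint32_t len) {
    uint32_t allocsz = (currlen + len) * 2;  /* wraps modulo 2^32 */
    if (allocsz < 32) {
        allocsz = 32;
    }
    return allocsz;
}

/* Hypothetical helper (not jq code): mirrors `length * 3 + 1`
 * evaluated in uint32_t. */
uint32_t wrapped_replace_size(uint32_t length) {
    return length * 3 + 1;  /* wraps modulo 2^32 */
}
```

With two 1 GiB operands (`currlen = len = 1073741824`), the sum is `2^31` and the doubled value is `2^32`, which wraps to 0 and is raised to 32, while the copy still moves the full string data. For the second pattern, a `length` of 1431655765 makes `length * 3 + 1` equal to `2^32`, which wraps to 0.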

**Fix commit**: `e47e56d226519635768e6aab2f38f0ab037c09e5` — adds explicit 64-bit overflow checks:

```c
if ((uint64_t)currlen + len >= INT_MAX) {
    jv_free(string);
    return jv_invalid_with_msg(jv_string("String too long"));
}
```
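The same style of guard can be exercised outside jq. The sketch below is an illustration under stated assumptions, not the code from the fix commit: `checked_append_size` is a hypothetical function that widens the sum to 64 bits before comparing it with `INT_MAX`, and only then computes the doubled size.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of jq). Returns false when the combined
 * length would reach INT_MAX, mirroring the `>= INT_MAX` test in the fix.
 * On success, *allocsz receives the doubled size with a 32-byte floor. */
bool checked_append_size(uint32_t currlen, uint32_t len, uint32_t *allocsz) {
    uint64_t total = (uint64_t)currlen + len;  /* cannot wrap in 64 bits */
    if (total >= INT_MAX) {
        return false;  /* caller should report "String too long" */
    }
    /* total < INT_MAX, so total * 2 < 2^32 and fits in uint32_t. */
    uint64_t doubled = total * 2;
    *allocsz = doubled < 32 ? 32u : (uint32_t)doubled;
    return true;
}
```

Performing the comparison before the multiplication is the important design choice: once the sum is known to be below `INT_MAX`, the doubled value is guaranteed to fit in 32 bits.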

## Reproduction Steps

1. Execute `repro/reproduction_steps.sh`
2. The script clones the jq repository and builds two binaries with AddressSanitizer:
   - **Vulnerable**: tag `jq-1.8.1`
   - **Fixed**: commit `e47e56d226519635768e6aab2f38f0ab037c09e5`
3. The script runs the trigger program on both binaries:
   ```jq
   ("A" * 1073741824) as $a | $a + $a
   ```
   This creates a 1 GiB string (1,073,741,824 bytes) and concatenates it with itself, producing a total length of 2,147,483,648 bytes (`INT_MAX` + 1).
4. **Expected evidence**:
   - Vulnerable build: AddressSanitizer reports `heap-buffer-overflow` in `jvp_string_append` (call chain: `binop_plus` → `jv_string_concat` → `jvp_string_append`)
   - Fixed build: jq exits gracefully with the error message `String too long`

## Evidence

- **Vulnerable ASAN output**: `logs/vulnerable_stderr.txt`
  ```
  ==7265==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x506000001311
  WRITE of size 1073741824 at 0x506000001311 thread T0
      #0 memcpy
      #1 jvp_string_append src/jv.c:1191
      #2 jv_string_concat src/jv.c:1501
      #3 binop_plus src/builtin.c:96
  ```
  The buffer allocated was only 49 bytes, confirming the integer-overflow-induced undersized allocation.

- **Fixed output**: `logs/fixed_stderr.txt`
  ```
  jq: error (at <unknown>): String too long
  ```

- **Runtime manifest**: `repro/runtime_manifest.json`

## Recommendations / Next Steps

1. **Upgrade jq to 1.8.2 or later** (or any build containing commit `e47e56d`). The fix is minimal and precisely targeted at the overflow conditions.
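2. **Input size limits**: For deployments that cannot immediately upgrade, enforce maximum JSON input sizes well below 2 GB to reduce the likelihood of triggering the overflow. This only helps when the filters themselves are trusted: as the reproduction shows, a filter that repeats or accumulates strings can exceed `INT_MAX` bytes starting from a tiny input.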
3. **Regression testing**: Add automated tests that attempt string concatenation near `INT_MAX` boundaries to prevent re-introduction of this bug class.
4. **Audit other integer size calculations**: Review the remaining uses of `uint32_t` for buffer-size math in `src/jv.c`, beyond `jvp_string_append` and `jvp_string_copy_replace_bad`, to ensure similar overflows are not present elsewhere.
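
One way to approach the regression-testing and audit items is a differential check: evaluate a size expression in both 32-bit and 64-bit arithmetic and flag any input where the two disagree. The sketch below applies this to the two expressions named in this report; the function names are hypothetical and nothing here is taken from jq's test suite.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical audit helpers (not jq code). Each returns true when the
 * 32-bit evaluation of a size expression differs from the 64-bit one,
 * i.e. when the 32-bit result has wrapped. */
bool append_size_wraps(uint32_t currlen, uint32_t len) {
    uint32_t narrow = (currlen + len) * 2;
    uint64_t wide = ((uint64_t)currlen + len) * 2;
    return (uint64_t)narrow != wide;
}

bool replace_size_wraps(uint32_t length) {
    uint32_t narrow = length * 3 + 1;
    uint64_t wide = (uint64_t)length * 3 + 1;
    return (uint64_t)narrow != wide;
}
```

Boundary values worth pinning in a regression test: a combined length of 2147483647 (`INT_MAX`) does not wrap `(currlen + len) * 2`, while 2147483648 does; for `length * 3 + 1`, 1431655764 does not wrap and 1431655765 does.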

## Additional Notes

- **Idempotency confirmed**: `repro/reproduction_steps.sh` was run twice consecutively; both runs produced identical ASAN crash output on the vulnerable build and identical graceful error output on the fixed build.
- **Edge cases / limitations**: The reproduction requires a host with enough RAM to hold two ~1 GiB strings plus ASAN overhead (≈ 3–4 GB peak). On memory-constrained systems the trigger can be restructured, but the combined concatenated length must still exceed `INT_MAX` (2,147,483,647 bytes) to hit the vulnerable code path. With `$a + $a`, the multiplier 1073741824 is already the smallest value that does so, so reducing the multiplier alone makes the trigger ineffective.
