# Variant Root Cause Analysis: CVE-2026-33017 — Stored-Custom-Component Bypass of the v1.9.0 Fix

## Summary

CVE-2026-33017 was fixed (per the ticket and the repro) in Langflow **1.9.0** by
removing the client-supplied `data` parameter from the unauthenticated public
flow-build endpoint `POST /api/v1/build_public_tmp/{flow_id}/flow` and hardcoding
`data=None` so the build always loads the flow definition from the database. This
variant is a **true bypass of that fix**. The same unauthenticated public build
path still calls `exec()` on each node's stored custom-component `code` at
graph-build time via `prepare_global_scope()`/`eval_custom_component_code`, and
the only validator the fix added (`validate_flow_for_current_settings`) is a
**no-op** under the default `allow_custom_components=true`. An attacker first
stores a malicious `CustomComponent` inside a **PUBLIC** flow via
`POST /api/v1/flows/`, using the `AUTO_LOGIN` superuser token (the exact
capability the original CVE repro already uses to create its public flow). The
attacker then triggers the public build with nothing but a `client_id` cookie
and obtains **unauthenticated RCE** on the CVE's "fixed"
`langflowai/langflow:1.9.0`. The bypass was confirmed empirically
(2/2 attempts: `exploit_status=200`, proof file written with
`uid=1000(user) gid=0(root)`). It is first closed in **v1.10.1** by the upstream
follow-up commit `626365f088` ("run trusted server code on unauthenticated public
flow builds", the H1-3754930 follow-up); versions **1.9.0 through 1.10.0** remain
vulnerable.

## Fix Coverage / Assumptions

**Invariant the original (v1.9.0) fix relies on:** "Public flows never accept
client-supplied data, the stored flow definition is benign, and
`validate_flow_for_current_settings` blocks any custom component that should
not run on the public path."

**Code path(s) it explicitly covers:**
- The in-request `data` body field of `POST /api/v1/build_public_tmp/{flow_id}/flow`
  (`src/backend/base/langflow/api/v1/chat.py:640`): the `data` parameter was
  removed and `data=None` hardcoded (`chat.py:720`), so attacker flow data can no
  longer be supplied in the request body. `source_flow_id=flow_id` (`chat.py:717`)
  forces the build to load the flow from the DB via `build_graph_from_db`
  (`src/backend/base/langflow/api/build.py:305-313`).

**What the fix does NOT cover (the gap):**
- The **stored-data** injection vector. `POST /api/v1/flows/`
  (`src/backend/base/langflow/api/v1/flows.py:86`, `_new_flow`) performs **no**
  custom-component validation when persisting a `PUBLIC` flow, so an attacker can
  store a flow whose `data` contains an arbitrary-code `CustomComponent`.
- The validator it relies on is a **no-op** under the default configuration.
  `validate_flow_for_current_settings` → `check_flow_and_raise`
  (`src/lfx/src/lfx/utils/flow_validation.py:158-166`) returns immediately when
  `allow_custom_components` is `True`, and `allow_custom_components` defaults to
  `True` (`src/lfx/src/lfx/services/settings/base.py:386`). The stored malicious
  custom component is therefore **not** blocked.
- Consequently, the public build still calls `exec()` on the node's stored `code`
  at graph-build time (`src/lfx/src/lfx/custom/validate.py:218-222`).
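
The second gap can be made concrete with a minimal model of a validator that short-circuits on a permissive flag. This is a sketch only: the flat node layout, the exception class, and the function body are simplified stand-ins rather than the Langflow source. The one behavior taken from the analysis above is the early return when `allow_custom_components` is `True`.

```python
class FlowValidationError(Exception):
    """Raised when a flow contains a component that must not run."""


def check_flow_and_raise(flow_data: dict, *, allow_custom_components: bool = True) -> None:
    # Early return: with the permissive default, nothing below ever runs.
    if allow_custom_components:
        return
    for node in flow_data.get("nodes", []):
        # Hypothetical field layout, used only for this illustration.
        if node.get("type") == "CustomComponent":
            raise FlowValidationError(f"custom component not allowed: {node.get('id')}")


flow = {"nodes": [{"id": "n1", "type": "CustomComponent", "code": "..."}]}

# Default settings: the call returns None and the custom component passes.
default_result = check_flow_and_raise(flow)

# Restrictive settings: the same flow is rejected.
try:
    check_flow_and_raise(flow, allow_custom_components=False)
    blocked = False
except FlowValidationError:
    blocked = True
```

The point of the model is that a guard keyed to a flag meant for authenticated users provides no protection on the public path whenever that flag is left at its default.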

## Variant / Alternate Trigger

**Bypass path (same root cause, same sink, different injection point):**

1. `GET /api/v1/auto_login` → superuser `access_token` (no credentials;
   `LANGFLOW_AUTO_LOGIN=true`, identical to the parent repro's setup).
2. `POST /api/v1/flows/` with `{"name": ..., "access_type": "PUBLIC", "data":
   <malicious_graph>}` — store a PUBLIC flow whose `data` contains one
   `CustomComponent` node whose `code` has a top-level
   `_rce = os.system("id > /tmp/rce-proof ... && echo RCE_CONFIRMED <token> >> ...")`.
   The top-level assignment is an `ast.Assign` node that
   `prepare_global_scope()` collects and `exec()`'s at graph-build time, before
   any component method runs.
3. `POST /api/v1/build_public_tmp/{flow_id}/flow` with **only** a `client_id`
   cookie and **no** `data` field in the body. The server loads the stored
   (malicious) flow from the DB and `exec()`'s the node code → RCE.

**Entry point:** `POST /api/v1/build_public_tmp/{flow_id}/flow`
(unauthenticated; `client_id` cookie only), with the malicious payload
pre-staged in the DB via the authenticated `POST /api/v1/flows/` flow-create
endpoint (AUTO_LOGIN token).

**Specific code path:**
`chat.py:build_public_tmp` (640) → `validate_flow_for_current_settings(flow.data)`
(no-op, 711) → `start_flow_build(data=None, source_flow_id=flow_id)` (717-720) →
`build.py:generate_flow_events`/`create_graph` (305-313) →
`build_graph_from_db` → `Graph.from_payload` → `create_class` →
`prepare_global_scope` (`custom/validate.py:218-222`) → `exec()`.
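
The last hop in this chain relies on ordinary Python semantics: a module-level assignment is evaluated as soon as its code is executed, before any class is instantiated or any method is called. The sketch below models that behavior with a harmless payload (a list append). The names `record` and `exec_top_level_assignments` are hypothetical, and the function is not the `prepare_global_scope` implementation; it only mirrors the behavior this analysis attributes to it.

```python
import ast

SIDE_EFFECTS = []


def record(tag: str) -> str:
    """Harmless stand-in for a side effect."""
    SIDE_EFFECTS.append(tag)
    return tag


component_source = '''
_marker = record("ran at build time")

class MyComponent:
    def run(self):
        return "never called in this demo"
'''


def exec_top_level_assignments(source: str, scope: dict) -> None:
    tree = ast.parse(source)
    # Keep only module-level assignments, as the analysis describes.
    assigns = [node for node in tree.body if isinstance(node, ast.Assign)]
    module = ast.Module(body=assigns, type_ignores=[])
    # Executing the assignments evaluates their right-hand sides immediately.
    exec(compile(module, "<component>", "exec"), scope)


scope = {"record": record}
exec_top_level_assignments(component_source, scope)
```

After the call, `SIDE_EFFECTS` holds one entry even though `MyComponent` was never defined in `scope`, let alone run. Any guard that inspects only component methods therefore misses this execution point.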

## Impact

- **Package/component:** Langflow (`langflow` PyPI; `langflowai/langflow` Docker
  image). Sensitive component: the unauthenticated public flow-build endpoint and
  the custom-component loader (`lfx/custom/validate.py`).
- **Affected versions (as tested):** `langflowai/langflow:1.9.0` (CVE's "fixed"
  version) is **still vulnerable**. Source analysis shows the bypass affects
  **1.9.0 through 1.10.0** (all versions that have the original fix but lack the
  H1-3754930 follow-up). The bypass is **closed in 1.10.1**.
- **Risk level:** Critical. Unauthenticated remote code execution is triggered by
  a single HTTP request to the public build endpoint, after a setup phase
  (`GET /api/v1/auto_login` plus one flow-create request) that needs no
  credentials under `LANGFLOW_AUTO_LOGIN=true`. The server process ran as
  `uid=1000(user) gid=0(root)` in the container.
- **Consequences:** Arbitrary command execution as the Langflow server user,
  enabling exfiltration of LLM API keys / cloud credentials, database/flow
  tampering, and persistence — identical to the parent CVE.
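
For inventory checks, the affected range above can be expressed as a half-open interval, `1.9.0 <= version < 1.10.1`. The helper below is a minimal sketch with hypothetical names. It assumes plain numeric release strings (an optional leading `v` is tolerated) and raises `ValueError` on pre-release suffixes. Versions before 1.9.0 predate the original fix and are outside the scope of this check.

```python
FIRST_AFFECTED = (1, 9, 0)   # first release carrying the original fix
FIRST_FIXED = (1, 10, 1)     # first release carrying the follow-up fix


def parse_version(text: str) -> tuple[int, ...]:
    # "v1.10.0" -> (1, 10, 0); tuples compare numerically, unlike raw strings.
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def is_affected_by_variant(version: str) -> bool:
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


checks = {v: is_affected_by_variant(v) for v in ("1.8.9", "1.9.0", "1.10.0", "v1.10.1")}
```

Comparing integer tuples avoids the string-ordering trap in which `"1.10.0"` sorts before `"1.9.0"`.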

## Impact Parity

- **Disclosed/claimed maximum impact (parent CVE):** Unauthenticated remote code
  execution via the public build endpoint.
- **Reproduced impact from this variant run:** Unauthenticated remote code
  execution via the public build endpoint on the CVE's "fixed" 1.9.0 (proof file
  containing `id` output + per-attempt token, `exploit_status=200`).
- **Parity: `full`.** The variant achieves the same unauthenticated RCE as the
  parent CVE, on the version the CVE claims is fixed.
- **Not demonstrated:** Beyond `os.system` command execution, no separate
  read/write primitive was exercised (same scope as the parent repro).

## Root Cause

The underlying bug — **`exec()` of attacker-controlled custom-component `code`
on the unauthenticated public build path with no sandboxing** — is unchanged.
The v1.9.0 fix only relocated the trust boundary: it stopped trusting the
*request body* `data`, but continued trusting the *database-stored* `data` and
continued `exec()`'ing its node `code`. Because the only validator guarding the
stored data (`validate_flow_for_current_settings`) short-circuits on the default
`allow_custom_components=true`, and because flow creation
(`POST /api/v1/flows/`) performs no custom-component validation, an attacker can
smuggle arbitrary code into a PUBLIC flow and have the unauthenticated public
build execute it.

**Fix commit that closes this variant (upstream follow-up):**
`626365f088379236776e0d72f7d18c9094e43ebb` —
*"fix(security): run trusted server code on unauthenticated public flow builds
(#13540)"*. It is an ancestor of `v1.10.1` (commit `a66b75ac26`) but **not** of
`v1.10.0`/`v1.9.0` (`git merge-base --is-ancestor` verified). Its own message
confirms the gap: *"… a public flow containing a plain CustomComponent — or any
node carrying arbitrary code — would execute on that path without authentication
(follow-up to H1-3754930)."*
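
The idea named in that commit title, running trusted server code rather than stored code, can be sketched as follows. This is an illustration under stated assumptions and not the upstream implementation: the registry `TRUSTED_COMPONENT_CODE`, the flat node layout, and the function `prepare_public_nodes` are hypothetical. Only the `allow_public_custom_components` flag name and the rejection message come from this report.

```python
class PublicFlowRejected(Exception):
    """Raised when a public flow carries code the server will not run."""


# Hypothetical registry: component type -> code shipped with the server.
TRUSTED_COMPONENT_CODE = {
    "ChatInput": "class ChatInput: ...",
    "ChatOutput": "class ChatOutput: ...",
}


def prepare_public_nodes(nodes, *, allow_public_custom_components=False):
    prepared = []
    for node in nodes:
        node_type = node.get("type")
        if node_type in TRUSTED_COMPONENT_CODE:
            # Known type: discard the stored code and use the server's copy.
            prepared.append({**node, "code": TRUSTED_COMPONENT_CODE[node_type]})
        elif node.get("code") and not allow_public_custom_components:
            # Unknown type carrying code: refuse to build.
            raise PublicFlowRejected("This flow cannot be executed.")
        else:
            prepared.append(dict(node))
    return prepared


tampered = [{"id": "n1", "type": "ChatInput", "code": "tampered = True"}]
safe_nodes = prepare_public_nodes(tampered)

try:
    prepare_public_nodes([{"id": "n2", "type": "CustomComponent", "code": "x = 1"}])
    rejected = False
except PublicFlowRejected:
    rejected = True
```

The design choice worth noting is that the decision no longer depends on `allow_custom_components`: stored code for a known type is replaced regardless of its content, and unknown code is rejected unless a separate public-path opt-in is set.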

## Reproduction Steps

1. The bypass is fully automated by
   `bundle/vuln_variant/reproduction_steps.sh` (helper
   `bundle/vuln_variant/variant_attempt.py`, run inside each container).
2. The script pulls `langflowai/langflow:1.9.0` (CVE "fixed") and
   `langflowai/langflow:1.10.1` (follow-up fixed), records each image's exact
   version/identity, then runs **2 attempts against 1.9.0** and **2 attempts
   against 1.10.1**. Each attempt:
   - starts a fresh container with `LANGFLOW_AUTO_LOGIN=true --backend-only`,
   - waits for `/health`,
   - `GET /api/v1/auto_login` → superuser token,
   - `POST /api/v1/flows/` → creates a **PUBLIC** flow whose stored `data`
     carries a `CustomComponent` with a top-level `_rce = os.system(...)` payload,
   - `POST /api/v1/build_public_tmp/{flow_id}/flow` with only a `client_id`
     cookie and **no** `data` body field,
   - polls for `/tmp/rce-proof` and copies it out as evidence, then tears the
     container down.
3. **Expected evidence (observed):**
   - On `1.9.0`: each attempt writes `logs/vuln_variant/proof_claimed_fixed_N.txt`
     with `uid=1000(user) gid=0(root) groups=0(root)` + `RCE_CONFIRMED <token>`,
     and `logs/vuln_variant/result_claimed_fixed_N.json` shows
     `exploit_status: 200`, `proof_exists: true`, `success: true`.
   - On `1.10.1`: `logs/vuln_variant/result_followup_fixed_N.json` shows
     `exploit_status: 400`, `exploit_body: {"detail":"This flow cannot be
     executed."}`, `proof_exists: false`, `success: false`.
4. **Exit semantics:** exit `0` = bypass reproduced on the CVE-claimed-fixed
   `1.9.0` (true bypass of the claimed fix) and not reproduced on `1.10.1`.
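
The exit semantics can be sketched as a small reducer over the per-attempt result files. The file naming and the `success` field follow the evidence listed in this report. Two details are assumptions: that "reproduced" means every attempt on `1.9.0` succeeded, and that any other outcome maps to exit code `1`. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_results(log_dir: Path, role: str) -> list[dict]:
    # role is "claimed_fixed" or "followup_fixed", matching the result file names.
    paths = sorted(log_dir.glob(f"result_{role}_*.json"))
    return [json.loads(path.read_text()) for path in paths]


def exit_code(claimed_fixed: list[dict], followup_fixed: list[dict]) -> int:
    # Assumption: every attempt on the claimed-fixed image must succeed.
    reproduced = bool(claimed_fixed) and all(r.get("success") is True for r in claimed_fixed)
    # The negative control must show no successful attempt at all.
    closed = bool(followup_fixed) and not any(r.get("success") for r in followup_fixed)
    return 0 if reproduced and closed else 1


observed = exit_code(
    [{"success": True}, {"success": True}],
    [{"success": False}, {"success": False}],
)
```

Requiring a non-empty list on both sides keeps a run with missing result files from being reported as a pass.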

## Evidence

- `bundle/vuln_variant/reproduction_steps.sh` — orchestrator (idempotent; verified
  by two consecutive runs, both exit 0).
- `bundle/vuln_variant/variant_attempt.py` — in-container exploit helper.
- `bundle/logs/vuln_variant/reproduction_steps.log` — full orchestrator log.
- `bundle/logs/vuln_variant/result_claimed_fixed_{1,2}.json` — 1.9.0 per-attempt
  results (auto_login=200, create_flow=201, exploit=200, proof_exists=true).
- `bundle/logs/vuln_variant/proof_claimed_fixed_{1,2}.txt` — exfiltrated proof
  files from the 1.9.0 containers.
- `bundle/logs/vuln_variant/result_followup_fixed_{1,2}.json` — 1.10.1
  per-attempt results (exploit=400 "This flow cannot be executed.",
  proof_exists=false).
- `bundle/logs/vuln_variant/container_{claimed,followup}_fixed_{1,2}.log` —
  container startup/runtime logs.
- `bundle/logs/vuln_variant/{claimed_fixed,followup_fixed}_image_identity.txt` —
  exact tested image version metadata (pip `1.9.0` / `1.10.1`).
- `bundle/vuln_variant/runtime_manifest.json`,
  `bundle/vuln_variant/validation_verdict.json`,
  `bundle/vuln_variant/source_identity.json`,
  `bundle/vuln_variant/root_cause_equivalence.json`,
  `bundle/vuln_variant/variant_manifest.json` — structured records.

Key excerpts (1.9.0, bypass succeeds):
```json
{"role":"claimed_fixed","mode":"bypass","token":"dde4c416676ff815","endpoint":"public_stored",
 "auto_login_status":200,"create_flow_status":201,"flow_id":"c9522932-...","exploit_status":200,
 "exploit_body":"{\"job_id\":\"eadbad1c-...\"}","proof_exists":true,
 "proof_content":"uid=1000(user) gid=0(root) groups=0(root)\nRCE_CONFIRMED dde4c416676ff815",
 "success":true}
```
Negative control (1.10.1, bypass closed):
```json
{"role":"followup_fixed","mode":"bypass","token":"e9b8fca5e5acaa53","endpoint":"public_stored",
 "auto_login_status":200,"create_flow_status":201,"flow_id":"ccf86bca-...","exploit_status":400,
 "exploit_body":"{\"detail\":\"This flow cannot be executed.\"}","proof_exists":false,
 "proof_content":null,"success":false}
```
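
A reviewer can cross-check each record by confirming that the proof content carries the same per-attempt token as the record itself, which rules out a stale proof file from an earlier attempt. The sketch below is a hypothetical helper, not part of the bundle; the field names and the `RCE_CONFIRMED <token>` line format are taken from the excerpts above.

```python
def proof_is_consistent(result: dict) -> bool:
    proof = result.get("proof_content") or ""
    token = result.get("token", "")
    if result.get("success"):
        # A success must have a proof file whose marker line names this attempt's token.
        marker = f"RCE_CONFIRMED {token}"
        return bool(token) and result.get("proof_exists") is True and marker in proof.splitlines()
    # A failure must not come with any proof at all.
    return not result.get("proof_exists") and not proof


claimed = {
    "token": "dde4c416676ff815",
    "proof_exists": True,
    "proof_content": "uid=1000(user) gid=0(root) groups=0(root)\nRCE_CONFIRMED dde4c416676ff815",
    "success": True,
}
followup = {
    "token": "e9b8fca5e5acaa53",
    "proof_exists": False,
    "proof_content": None,
    "success": False,
}
# Same proof text, different token: models a leftover file from another attempt.
stale = dict(claimed, token="0000000000000000")
```

Both excerpted records pass this check, while the `stale` record fails it.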

Environment: Docker 29.6.1; official images `langflowai/langflow:1.9.0` and
`:1.10.1`; exploit executed via `docker exec` inside each container against
`http://127.0.0.1:7860` (the sandbox cannot reach published host ports). No
sanitizers were used; this is a non-sanitized production-path proof. Source
identity resolved via image build metadata (pip version) → release tag → git
commit: `1.9.0` → `a47f2ad17e` (bypass reproduces), `1.10.1` → `a66b75ac26`
(bypass closed).

## Recommendations / Next Steps

1. **Upgrade to Langflow ≥ 1.10.1**, which substitutes the server's trusted code
   for known component types on the public path and rejects unknown/custom
   component types carrying code (`prepare_public_flow_build`). Anyone running
   1.9.0–1.10.0 is still exposed to this bypass.
2. Make public-path code-execution protection **independent** of the global
   `allow_custom_components` flag. The flag governs *authenticated* behavior;
   it must not silently disable *unauthenticated* public-path protection. The
   `allow_public_custom_components` opt-in (default `false`) introduced in
   v1.10.1 is the correct pattern and should be preserved.
3. Validate custom-component code at **flow storage time**
   (`POST /api/v1/flows/` / `PATCH`) for `access_type=PUBLIC` flows, so a
   malicious public flow can never be persisted.
4. Extend the same hardening to **every** public graph-build/run entry point,
   including any added in the future, not only the one named in the CVE. Add a
   regression test that stores a malicious `CustomComponent` in a public flow,
   triggers the public build, and asserts that the attacker's code never runs:
   the build is rejected or runs the server's trusted code instead, and no
   proof side effect appears.
5. Treat `LANGFLOW_AUTO_LOGIN=true` as incompatible with public-flow exposure:
   it issues a superuser session with no credentials, which is the precondition
   that lets an anonymous attacker create the malicious public flow used here.
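
Recommendation 3 could take the form of a storage-time check that rejects a `PUBLIC` flow when any node's code differs from what the server ships for that component type. The sketch below is hypothetical throughout: the nested node layout (`data.node.template.code.value`), the hash registry, and all function names are assumptions made for illustration, not Langflow APIs.

```python
import hashlib
from typing import Optional


class PublicFlowStorageError(ValueError):
    """Raised when a PUBLIC flow carries code the server does not recognize."""


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def node_code(node: dict) -> Optional[str]:
    # Assumed layout: node["data"]["node"]["template"]["code"]["value"].
    template = node.get("data", {}).get("node", {}).get("template", {})
    code_field = template.get("code")
    return code_field.get("value") if isinstance(code_field, dict) else None


def validate_public_flow_for_storage(flow: dict, trusted_hashes: dict) -> None:
    if flow.get("access_type") != "PUBLIC":
        return  # private flows stay governed by the authenticated-path settings
    for node in (flow.get("data") or {}).get("nodes", []):
        code = node_code(node)
        if code is None:
            continue
        node_type = node.get("data", {}).get("type")
        # Reject unknown types and known types whose code was modified.
        if trusted_hashes.get(node_type) != sha256_hex(code):
            raise PublicFlowStorageError(f"untrusted code in public flow node {node.get('id')!r}")


def make_node(node_id: str, node_type: str, code: str) -> dict:
    return {"id": node_id, "data": {"type": node_type, "node": {"template": {"code": {"value": code}}}}}


TRUSTED = {"ChatInput": sha256_hex("class ChatInput: ...")}
ok_flow = {"access_type": "PUBLIC", "data": {"nodes": [make_node("n1", "ChatInput", "class ChatInput: ...")]}}
bad_flow = {"access_type": "PUBLIC", "data": {"nodes": [make_node("n2", "CustomComponent", "x = 1")]}}

validate_public_flow_for_storage(ok_flow, TRUSTED)  # accepted
try:
    validate_public_flow_for_storage(bad_flow, TRUSTED)
    rejected = False
except PublicFlowStorageError:
    rejected = True
```

A check like this complements, rather than replaces, build-time protection: flows stored before the check was deployed, or changed from private to public later, would still reach the build path.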

## Additional Notes

- **Bypass vs. alternate trigger:** This is a **bypass** of the CVE's stated fix
  (1.9.0): it reproduces unauthenticated RCE on the version the CVE says is
  fixed. It is closed by a later, separate upstream follow-up (1.10.1), which the
  maintainer's own commit message attributes to the same class of issue
  ("follow-up to H1-3754930").
- **Trust boundary:** Identical to the parent CVE: the unauthenticated public
  build path. The flow-creation pre-step uses the `AUTO_LOGIN` superuser token
  (no credentials), exactly as the parent repro does to create its public flow.
  The issue falls within Langflow's stated threat model (unauthenticated
  visitors must never execute arbitrary custom-component code on the public
  path), so it is a vulnerability, **not** documented or accepted behavior.
- **Idempotency:** The script removes any prior `/tmp/rce-proof` at the start of
  each attempt, uses unique per-attempt tokens, and tears the container down
  afterwards. Verified by two consecutive full runs (both exit 0, both
  2/2 on 1.9.0 and 0/2 on 1.10.1).
- **Not conflated with a separate bug:** Same root cause (exec of
  attacker-controlled custom-component `code` on the unauthenticated public
  build path) and same sink (`prepare_global_scope` → `exec`) as the parent CVE;
  only the injection point (request body vs. stored DB data) differs.
- **Edge case considered and excluded as a *different* vector (not claimed
  here):** the *authenticated* `POST /api/v1/build/{flow_id}/flow` endpoint still
  accepts a `data` parameter, and under `allow_custom_components=true` its
  `validate_flow_for_current_settings` call is also a no-op, so an AUTO_LOGIN
  token holder can achieve RCE via that authenticated path too. That path sits
  behind a different (higher) trust boundary and is by design per Langflow's
  threat model ("allow_custom_components grants custom-code execution to
  authenticated users"), so it is **not** claimed as the variant. The reported
  variant stays strictly on the **unauthenticated** public path, matching the
  parent CVE's trust boundary.
