# Root Cause Analysis: CVE-2026-33017

## Summary

CVE-2026-33017 is an unauthenticated remote code execution (RCE) vulnerability in
Langflow prior to version 1.9.0. The public flow-build endpoint
`POST /api/v1/build_public_tmp/{flow_id}/flow` accepts an attacker-controlled
`data` parameter (`FlowDataRequest`) containing arbitrary Python code inside a
custom component node. Because the endpoint is intentionally unauthenticated for
public flows, any remote attacker can reach it. The supplied flow definition is
passed through `start_flow_build()` → `build_graph_from_data()` →
`Graph.from_payload()` and ultimately to the custom-component loader, which
extracts the `code` field and executes it with `exec()` inside
`prepare_global_scope()` (in `lfx/custom/validate.py`) without any sandboxing.
A module-level assignment such as `_rce = os.system(...)` is an `ast.Assign`
node that `prepare_global_scope()` collects and `exec()`s at graph-build time,
yielding arbitrary command execution with the privileges of the Langflow server
process. A single HTTP request is sufficient.

## Impact

- **Product:** Langflow (PyPI package `langflow`; Docker image `langflowai/langflow`)
- **Affected versions:** `langflow < 1.9.0` (reproduction uses `1.8.1` as the
  vulnerable image).
- **Patched versions:** `>= 1.9.0` (the public build endpoint hardcodes
  `data=None` and loads the stored flow from the database only).
- **Risk level:** Critical (added to the CISA KEV catalog on 2026-03-25).
- **Consequences:** An unauthenticated, remote attacker can run arbitrary system
  commands, read environment variables (including LLM API keys and cloud
  credentials), access or modify the database and flow data, and establish
  persistence. In the tested container image the server process ran as
  `uid=1000(user) gid=0(root)`.

## Impact Parity

- **Disclosed/claimed maximum impact:** Unauthenticated remote code execution
  via a single HTTP request to the public build endpoint.
- **Reproduced impact from this run:** Confirmed code execution. The vulnerable
  `langflowai/langflow:1.8.1` container wrote `/tmp/rce-proof` containing the
  output of the `id` command (`uid=1000(user) gid=0(root) groups=0(root)`) plus a
  unique per-attempt token, after receiving an **unauthenticated** HTTP POST to
  `/api/v1/build_public_tmp/{flow_id}/flow`. The fixed
  `langflowai/langflow:1.9.0` container did **not** write the proof file under
  the identical request (negative control).
- **Parity:** `full`.
- **Not demonstrated:** None. The claimed unauthenticated RCE impact was
  demonstrated end-to-end against the real product.

## Root Cause

The vulnerable endpoint `build_public_tmp` in
`src/backend/base/langflow/api/v1/chat.py` (v1.8.1) declares an inbound
`data: FlowDataRequest` parameter and forwards it directly to
`start_flow_build()`:

```python
@router.post("/build_public_tmp/{flow_id}/flow")
async def build_public_tmp(..., data: Annotated[FlowDataRequest | None, Body(embed=True)] = None, ...):
    owner_user, new_flow_id = await verify_public_flow_and_get_user(flow_id=flow_id, client_id=client_id)
    job_id = await start_flow_build(flow_id=new_flow_id, ..., data=data, ...)
```

The build started by `start_flow_build()`
(`src/backend/base/langflow/api/build.py`) constructs the graph in
`create_graph()`, which uses the attacker-supplied data whenever it is present:

```python
async def create_graph(...):
    if not data:
        return await build_graph_from_db(...)
    return await build_graph_from_data(flow_id=..., payload=data.model_dump(), ...)
```

`build_graph_from_data()` → `Graph.from_payload()` constructs vertices from the
attacker-supplied nodes. For a custom component (`template._type == "Component"`),
the loader calls `create_class(code, class_name)` in
`src/lfx/src/lfx/custom/validate.py`, which calls `prepare_global_scope(module)`.
That function iterates over the module body, collects top-level `ast.Assign` /
`ast.AnnAssign` / `ast.ClassDef` / `ast.FunctionDef` nodes into `definitions`,
compiles them, and runs:

```python
if definitions:
    combined_module = ast.Module(body=definitions, type_ignores=[])
    compiled_code = compile(combined_module, "<string>", "exec")
    exec(compiled_code, exec_globals)   # <-- attacker module-level code runs here
```

Therefore a top-level `_rce = os.system("id > /tmp/rce-proof ...")` executes
during graph construction, before any output is produced.
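
The effect of that node filter can be reproduced in isolation. The following is a minimal sketch, not Langflow's code: it mirrors only the filter and the `exec()` call quoted above and omits anything else the real function does. The helper `run_collected_definitions`, the `SOURCE` string, and the `events` list are hypothetical names; the list is a harmless stand-in for the `os.system(...)` side effect.

```python
import ast

# Harmless stand-in for attacker-controlled component source.
# Appending to `events` replaces the os.system(...) side effect.
SOURCE = '''
events.append("bare expression")          # ast.Expr: not collected
_marker = events.append("assignment")     # ast.Assign: collected and executed

class Component:                           # ast.ClassDef: collected
    events.append("class body")            # class bodies run at definition time
'''

# Top-level node types named in the text above.
COLLECTED = (ast.Assign, ast.AnnAssign, ast.ClassDef, ast.FunctionDef)


def run_collected_definitions(source: str, exec_globals: dict) -> None:
    """Compile and exec only the collected top-level nodes of `source`."""
    module = ast.parse(source)
    definitions = [node for node in module.body if isinstance(node, COLLECTED)]
    if definitions:
        combined = ast.Module(body=definitions, type_ignores=[])
        exec(compile(combined, "<string>", "exec"), exec_globals)


events: list[str] = []
run_collected_definitions(SOURCE, {"events": events})
print(events)
```

Running the sketch prints `['assignment', 'class body']`: the right-hand side of the assignment and the class body both execute as soon as the module is built, without any component method being called, while the bare expression is filtered out.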

The only access control on the endpoint is `verify_public_flow_and_get_user()`,
which checks only that the targeted `flow_id` is marked `PUBLIC` in the
database and that a `client_id` cookie is present; the cookie value is not
validated. In the reproduction the attacker creates the public flow themselves
through the AUTO_LOGIN superuser session, so the check is trivially satisfied.
Any flow already marked `PUBLIC` would pass the same check.
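
To make the weakness of that gate concrete, here is a hypothetical model of the two conditions as this report describes them. It is an illustration only and not the body of `verify_public_flow_and_get_user()`; `StoredFlow` and `passes_public_check` are invented names.

```python
from dataclasses import dataclass


@dataclass
class StoredFlow:
    """Hypothetical stand-in for a flow row; only the field the check reads."""

    flow_id: str
    access_type: str


def passes_public_check(flow: StoredFlow, cookies: dict[str, str]) -> bool:
    """Model of the two described conditions (assumption, not Langflow's code)."""
    return flow.access_type == "PUBLIC" and "client_id" in cookies


flow = StoredFlow(flow_id="example-flow", access_type="PUBLIC")
# The cookie value is arbitrary, and the request body is never consulted.
print(passes_public_check(flow, {"client_id": "anything"}))
print(passes_public_check(flow, {}))
```

Neither condition depends on the content of the request body, which is why the check does nothing to constrain the flow definition that follows it.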

**Fix (v1.9.0):** the endpoint no longer accepts a `data` parameter and hardcodes
`data=None`, so the build always loads the stored flow definition from the
database. It also validates the stored flow with
`validate_flow_for_current_settings()` and rejects custom components on the
public path (`CustomComponentValidationError` → HTTP 400). The relevant diff
removes `data: ... = None` from the signature and changes `data=data` to
`data=None` in the `start_flow_build(...)` call:

```python
# v1.9.0
job_id = await start_flow_build(flow_id=new_flow_id, source_flow_id=flow_id,
    ..., data=None,  # Always None - public flows load from database only
    ...)
```
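
The behavioral difference between the two versions reduces to which flow definition reaches the graph builder. The sketch below models only that selection, under the assumption that the `if not data` branch shown earlier is the sole decision point. All function names are hypothetical, and `fixed_endpoint` keeps a `data` argument purely for symmetry, whereas the real v1.9.0 endpoint no longer declares one.

```python
from typing import Optional

# Benign definition saved in the database for the public flow.
STORED_FLOW = {"nodes": [], "edges": []}


def select_flow_definition(data: Optional[dict]) -> dict:
    """Mirror the `if not data` branch: client data wins when it is present."""
    if not data:
        return STORED_FLOW
    return data


def vulnerable_endpoint(data: Optional[dict]) -> dict:
    """Models v1.8.1: the request body is forwarded (data=data)."""
    return select_flow_definition(data)


def fixed_endpoint(data: Optional[dict]) -> dict:
    """Models v1.9.0: the request body is discarded (data=None)."""
    return select_flow_definition(None)


client_payload = {"nodes": [{"code": "<attacker source>"}], "edges": []}
print(vulnerable_endpoint(client_payload) is client_payload)
print(fixed_endpoint(client_payload) is STORED_FLOW)
```

Both lines print `True`: the vulnerable model builds from the client payload, while the fixed model always falls back to the stored definition regardless of what the client sends.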

## Reproduction Steps

1. The reproduction is fully automated by
   `bundle/repro/reproduction_steps.sh` (with helper
   `bundle/repro/repro_attempt.py`).
2. The script pulls `langflowai/langflow:1.8.1` (vulnerable) and
   `langflowai/langflow:1.9.0` (fixed), then runs **2 vulnerable** and **2 fixed**
   isolated attempts. Each attempt:
   - starts a fresh Langflow container with `LANGFLOW_AUTO_LOGIN=true` and
     `--backend-only`,
   - waits for the `/health` endpoint,
   - performs `GET /api/v1/auto_login` to obtain a superuser access token,
   - creates a PUBLIC flow via `POST /api/v1/flows/`,
   - sends the unauthenticated exploit `POST
     /api/v1/build_public_tmp/{flow_id}/flow` with a `client_id` cookie and a
     body whose `data` contains one `CustomComponent` node whose `code` holds a
     top-level `_rce = os.system("id > /tmp/rce-proof && echo RCE_CONFIRMED
     <token> >> /tmp/rce-proof")`,
   - polls for `/tmp/rce-proof` inside the container and copies it out as
     evidence, then tears the container down.
3. Expected evidence: on the vulnerable image each attempt produces
   `logs/proof_vuln_N.txt` containing the `id` output and the unique token, with
   `exploit_status: 200` and `proof_exists: true` in `logs/result_vuln_N.json`.
   On the fixed image `proof_exists: false` for every attempt
   (`logs/result_fixed_N.json`).
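
The polling step matters for the negative control: on the fixed image, the absence of the proof file is meaningful only if the poll runs to its full timeout. The following generic sketch shows that pattern. It is not taken from `repro_attempt.py`; `wait_until` is a hypothetical helper, and a local temporary file stands in for the in-container proof file.

```python
import os
import tempfile
import time
from typing import Callable


def wait_until(probe: Callable[[], bool], timeout_s: float, interval_s: float = 0.2) -> bool:
    """Call `probe` repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "marker")
    # File never appears: the poll must run to the deadline before giving up.
    absent = wait_until(lambda: os.path.exists(marker), timeout_s=0.3, interval_s=0.1)
    open(marker, "w").close()
    # File exists: the poll succeeds on the first probe.
    present = wait_until(lambda: os.path.exists(marker), timeout_s=0.3, interval_s=0.1)

print(absent, present)
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted during the run.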

## Evidence

- `bundle/logs/reproduction_steps.log` — full orchestrator log.
- `bundle/logs/result_vuln_{1,2}.json` — per-attempt JSON results for the
  vulnerable image (auto_login=200, create_flow=201, exploit=200,
  proof_exists=true, proof_content with `uid=1000(user)...` + token).
- `bundle/logs/proof_vuln_{1,2}.txt` — the proof file exfiltrated from the
  vulnerable container (`id` output + `RCE_CONFIRMED <token>`).
- `bundle/logs/result_fixed_{1,2}.json` — per-attempt JSON results for the fixed
  image (exploit=200 but proof_exists=false).
- `bundle/logs/container_{vuln,fixed}_{1,2}.log` — container startup/runtime
  logs.
- `bundle/repro/runtime_manifest.json` — structured runtime evidence
  (`entrypoint_kind=api_remote`, `service_started=true`,
  `healthcheck_passed=true`, `target_path_reached=true`).
- `bundle/repro/validation_verdict.json` — structured verdict.

Key excerpt from a manual run against `langflowai/langflow:1.8.1`:

```json
{"role":"vuln","token":"manualtest1","auto_login_status":200,"create_flow_status":201,
 "flow_id":"b43e6614-...","exploit_status":200,
 "exploit_body":"{\"job_id\":\"8449e0de-...\"}","proof_exists":true,
 "proof_content":"uid=1000(user) gid=0(root) groups=0(root)\nRCE_CONFIRMED manualtest1",
 "success":true}
```

Negative control against `langflowai/langflow:1.9.0` (identical request; the
endpoint still answers HTTP 200, but no proof file is written):

```json
{"role":"fixed","token":"fixedtest1","auto_login_status":200,"create_flow_status":201,
 "exploit_status":200,"proof_exists":false,"proof_content":null,"success":false}
```
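
The two excerpts can be checked mechanically. The sketch below is one possible way to do so, assuming result records carry the `role`, `token`, `proof_exists`, and `proof_content` fields shown above. The helpers `attempt_ok` and `reproduction_ok` are hypothetical and are not part of the bundle's scripts.

```python
def attempt_ok(result: dict) -> bool:
    """Return True when one attempt matches the expectation for its image."""
    if result["role"] == "vuln":
        content = result.get("proof_content") or ""
        # The per-attempt token ties the proof file to this specific request.
        return bool(result["proof_exists"]) and result["token"] in content
    # Fixed image: the proof file must be absent.
    return not result["proof_exists"]


def reproduction_ok(results: list[dict]) -> bool:
    """Require at least one attempt per role and every attempt to match."""
    roles = {r["role"] for r in results}
    return {"vuln", "fixed"} <= roles and all(attempt_ok(r) for r in results)


# Fields copied from the two excerpts above; other fields are omitted.
vuln = {
    "role": "vuln",
    "token": "manualtest1",
    "proof_exists": True,
    "proof_content": "uid=1000(user) gid=0(root) groups=0(root)\nRCE_CONFIRMED manualtest1",
}
fixed = {"role": "fixed", "token": "fixedtest1", "proof_exists": False, "proof_content": None}

print(reproduction_ok([vuln, fixed]))
```

Matching the token, rather than only testing for the file's existence, guards against a stale proof file from an earlier attempt being counted as a success.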

Environment: Docker 29.6.1; official images `langflowai/langflow:1.8.1` and
`:1.9.0`. The exploit request was issued via `docker exec` from inside each
container because the sandbox cannot reach published host ports, so all HTTP
traffic targets `http://127.0.0.1:7860` from within the container. No
sanitizers were used; the proof exercises the unmodified production code path.

## Recommendations / Next Steps

- **Upgrade** to Langflow `>= 1.9.0` immediately. The public build endpoint no
  longer accepts client-supplied flow definitions and validates stored flows.
- If AUTO_LOGIN must stay enabled in production, restrict network exposure of
  the Langflow HTTP port and place it behind an authenticated reverse proxy;
  AUTO_LOGIN issues a superuser session without credentials.
- Consider disabling custom components entirely on public flows
  (`allow_custom_components=false`) and enforcing `access_type=PRIVATE` by
  default.
- Add an integration test that posts a custom-component payload with a
  module-level side-effect sentinel to `build_public_tmp` and asserts it never
  fires, to prevent regressions of this fix.
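
The shape of such a regression test might look like the sketch below. It is written under stated assumptions: the node layout in `make_sentinel_payload` is a simplified placeholder rather than Langflow's real flow schema, and `patched_stub` / `unpatched_stub` are hypothetical stand-ins for a test client that posts to `build_public_tmp`.

```python
import os
import tempfile
from typing import Callable


def make_sentinel_payload(sentinel_path: str) -> dict:
    """Flow body whose component source creates a file at module level."""
    code = f"_touched = open({sentinel_path!r}, 'w').close()\n"
    return {"nodes": [{"code": code}], "edges": []}  # simplified placeholder shape


def check_sentinel_never_fires(post_public_build: Callable[[dict], None]) -> None:
    """Post the payload and fail if the module-level side effect ran."""
    with tempfile.TemporaryDirectory() as tmp:
        sentinel = os.path.join(tmp, "sentinel")
        post_public_build(make_sentinel_payload(sentinel))
        if os.path.exists(sentinel):
            raise AssertionError("client-supplied code executed")


def patched_stub(payload: dict) -> None:
    """Stand-in for the fixed endpoint: the request body is ignored."""


def unpatched_stub(payload: dict) -> None:
    """Stand-in for the vulnerable path: node code is executed."""
    for node in payload["nodes"]:
        exec(node["code"], {})


check_sentinel_never_fires(patched_stub)  # passes silently
try:
    check_sentinel_never_fires(unpatched_stub)
    caught = False
except AssertionError:
    caught = True
print(caught)
```

In a real test suite the stub would be replaced by an HTTP call against a running instance, and the sentinel check would need to look inside the server's filesystem rather than the test runner's.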

## Additional Notes

- **Idempotency:** The script removes any prior `/tmp/rce-proof` at the start of
  each attempt and tears down the container afterwards, so consecutive runs are
  clean and reproducible. Verified by running two vulnerable and two fixed
  attempts back-to-back.
- The malicious payload is delivered as a top-level **assignment**
  (`_rce = os.system(...)`) rather than a bare expression, because
  `prepare_global_scope()` only `exec()`s nodes it classifies as `ast.Assign` /
  `ast.AnnAssign` / `ast.ClassDef` / `ast.FunctionDef`; an assignment is
  guaranteed to execute at graph-build time.
- The flow created for the exploit uses a benign empty `data`
  (`{"nodes":[],"edges":[]}`) simply to satisfy the `access_type=PUBLIC`
  requirement; on the vulnerable path the attacker-supplied `data` overrides the
  stored definition, so the stored content is irrelevant.
- The proof file is written inside the container filesystem and copied out via
  `docker cp` for durable evidence.
