Skip to content

fix: only add string_view life support for transient sources#6096

Open
henryiii wants to merge 2 commits into
pybind:masterfrom
henryiii:fix-string-view-life-support-no-frame
Open

fix: only add string_view life support for transient sources#6096
henryiii wants to merge 2 commits into
pybind:masterfrom
henryiii:fix-string-view-life-support-no-frame

Conversation

@henryiii

@henryiii henryiii commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

🤖 AI text below 🤖

Description

Follow-up to #6092.

#6092 added loader_life_support::add_patient(src) whenever a string view (std::string_view) is loaded, to fix a real use-after-free: when a container of views is built from a non-sequence iterable (e.g. a generator), the container caster materializes a temporary (a tuple/dict, or the generator's per-iteration items) that owns the strings and releases it before the bound function body runs, leaving the views dangling. Confirmed under AddressSanitizer (freed-by: pybind11::tuple::~tuple()tupledeallocunicode_dealloc; UAF read in the function body). So the fix is needed — reverting #6092 reintroduces the UAF.

However, add_patient throws when there is no life support frame (i.e. outside a bound function dispatch), so casting a view via a manual py::cast<std::string_view>(obj) in helper/embedding code started throwing even though the caller owns obj — a regression hit in real codebases.

Approach

The view caster cannot tell a durable, caller-owned source from a transient, pybind11-managed one (a tuple materialized from a generator, or a generator element the iterator only borrows). That provenance lives in the container caster. This PR introduces an ambient detail::transient_source_guard that the list, set, map, and array casters set around their generator/materialized paths, and the string caster keeps the source alive only when loading from a transient source.

Result:

source inside bound function outside (no frame)
durable (direct arg, list/set/dict, manual cast) no life support added works (no throw)
transient (generator/iterator) kept alive via life support throws (cannot be made safe)

This addresses both goals: life support is added only when actually needed, and a generator used outside a frame fails loudly instead of silently dangling. The guard restores (rather than clears) the previous value, so a durable container nested in a transient one is correctly treated as transient.

The pre-existing add_patient(utfNbytes) for UTF-16/32 views stays unconditional: there the view points into a freshly re-encoded temporary that always needs life support regardless of source.

Tests

  • New Catch2 tests in test_interpreter.cpp (no-frame context, which m.def lambdas can't reproduce since they always have a frame): casting a view from a durable source (str/bytes/bytearray/list) outside a bound function succeeds; casting std::vector<std::string_view> from a generator outside a bound function throws cast_error.
  • fix: add life support to handles cast to string_view #6092's existing test_stl generator test continues to cover the in-frame UAF (CI runs it under ASAN).

Verified locally with a standalone libpython+ASAN harness across the full matrix above: in-frame generator is ASAN-clean, out-of-frame durable cases succeed, out-of-frame generator throws.

Suggested changelog entry:

  • String views (e.g. std::string_view) are now kept alive only when loaded from a transient source (such as a generator), fixing both a use-after-free and a regression where casting a view from a durable object outside a bound function would throw.

@henryiii henryiii force-pushed the fix-string-view-life-support-no-frame branch from b8a651e to a46ca00 Compare June 26, 2026 15:12
@henryiii henryiii changed the title fix: don't throw from string_view life support outside a bound function fix: only add string_view life support for transient sources Jun 26, 2026
@henryiii henryiii force-pushed the fix-string-view-life-support-no-frame branch 2 times, most recently from e785283 to 13bc9b4 Compare June 26, 2026 15:36
henryiii added 2 commits June 26, 2026 12:04
PR pybind#6092 added loader_life_support::add_patient(src) to keep the source
object alive when loading a string view, fixing a real use-after-free when
a container of views is built from a non-sequence iterable (e.g. a
generator): list_caster materializes a temporary tuple that owns the
strings and destroys it when load() returns, before the bound function
body runs.

add_patient throws when there is no life support frame, so casting to a
view outside a bound function (e.g. a manual py::cast<std::string_view>)
now raises instead of relying on the caller-owned source, a regression
from pybind#6092.

For these view-into-src cases registration is best effort: inside a bound
function it keeps src alive (fixing the UAF), and outside one the caller
owns src's lifetime as before. Add try_add_patient(), which returns false
instead of throwing when there is no frame, and use it at the three view
load sites. add_patient() keeps its strict contract for value-creating
conversions.

Assisted-by: ClaudeCode:claude-opus-4.8
Refine the previous commit. Best-effort registration (try_add_patient)
silently produces a dangling view when a container of views is built from
a generator outside a bound function: there the materialized temporary is
released before the view is used, and with no frame nothing keeps it
alive. Such a cast cannot be made safe, so it should fail loudly, while a
view into a durable, caller-owned object needs no life support at all.

The view caster cannot tell a durable source from a pybind11-managed
transient one; that provenance lives in the container caster. Introduce an
ambient transient_source_guard that the list, set, map, and array casters
set around their generator/materialized paths, and have the string caster
keep the source alive only when loading from a transient source (via the
throwing add_patient, so try_add_patient is no longer needed). This means:

- views into durable sources (direct arguments, sequences, manual casts)
  add no life support and no longer throw outside a bound function, and
- a generator used outside a frame throws, rather than silently dangling.

The guard restores (rather than clears) the previous value, so a durable
container nested in a transient one is correctly treated as transient.

Verified with AddressSanitizer: the in-frame generator case is clean, the
out-of-frame durable cases succeed, and the out-of-frame generator case
throws.

Assisted-by: ClaudeCode:claude-opus-4.8
@henryiii henryiii force-pushed the fix-string-view-life-support-no-frame branch from 13bc9b4 to c31659c Compare June 26, 2026 16:05
// a bound function (no life support frame) the latter cannot be made safe, so `add_patient`
// rightly throws rather than producing a dangling view.
inline bool &loading_from_transient_source() {
static thread_local bool flag = false;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an ugly hack. Surely claude can suggest a different way?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't pass anything through, though? Didn't seem obvious to me but I can poke at it a bit.

@charlesbeattie

Copy link
Copy Markdown
Contributor

Should we rollback #6092 until this or similar solution is submitted? The bug is real but the problems caused by that change are not worth it with this one.

@rwgk

rwgk commented Jun 28, 2026

Copy link
Copy Markdown
Collaborator

Should we rollback #6092 until this or similar solution is submitted? The bug is real but the problems caused by that change are not worth it with this one.

I haven't look at this PR yet, but rolling back #6092 sounds like the right step, to cleanly go back to the original, long-lived baseline. @charlesbeattie if you send the rollback, I'll merge it straightaway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants