gh-148925: Add safe PyUnstable_* APIs for call-stack iteration by danielsn · Pull Request #148930 · python/cpython

danielsn · 2026-04-23T19:53:25Z

#145559 proposes a signal-safe mechanism to print backtraces. This is useful for human debugging, but not ideal for tools because it requires parsing human-readable text. For example, a crash-tracker may wish to export and analyze structured backtraces. Similarly, a profiler can benefit from the ability to generate structure stack-traces in pure C, without needing to reason about the safety of reentering the python interpreter.

Issue: Expose signal-safe way to get a structured stacktrace #148925

📚 Documentation preview 📚: https://cpython-previews--148930.org.readthedocs.build/

…CallStack

python-cla-bot · 2026-04-23T19:53:30Z

All commit authors signed the Contributor License Agreement.

pablogsal · 2026-04-29T13:39:31Z

Will try to review this week. Heads up that we are very close to beta freeze so we would need to land this this week or next week at the latest

pablogsal

I'm not sure the cost/benefit pencils out here.

This is a meaningful amount of new code to carry: a new public-ish struct with fixed buffers, a new format_ascii encoder that largely duplicates _Py_DumpASCII, a new collect/print pair, plus a refactor of dump_traceback. The safety story is subtle enough that it requires a maintainer-contract banner just to keep future contributors from breaking it (no malloc, no refcount, no GIL, no non-AS-safe libc, etc.), and even with that discipline in place the contract is admittedly best-effort as CollectCallStack itself documents that it "may even rash itself."

A third-party extension can already do all of this today: include the CPython headers, build against Py_BUILD_CORE-style internals if needed, and walk frames using the existing PyUnstable_InterpreterFrame_* primitives. That's always possible. But landing it here flips the direction of the maintenance burden instead of profiler vendors maintaining their own stack-walker against each CPython version (which they already do for everything else they care about), we take on maintaining it for them, with the additional constraint that the AS-safety invariants must be preserved by every future contributor touching traceback.c. That's a real cost, not a free PyUnstable_* escape hatch.

It's also worth noting that this only helps a fairly narrow class of tools: in-process profilers and crash handlers running inside the target interpreter. The majority of production Python profilers are out-of-process. Those tools get nothing from this API; they need stable struct layouts and offsets they can read remotely, not an in-process C function. So we'd be adding a non-trivial maintenance surface to CPython to serve a subset of in-process tooling, while the dominant profiler architecture is unaffected.

That said, I want to think about this more before taking a firm position, and I'd like to hear other core devs weigh in.

bedevere-app · 2026-04-30T15:04:38Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

pablogsal · 2026-04-30T15:05:08Z

CC @vstinner @lysnikolaou

vstinner · 2026-04-30T15:41:35Z

It was proposed to make PyUnstable_DumpTraceback() public: see PR gh-148145. IMO it would already be a first step forward. I prefer to wait until that's done to consider adding another similar-but-different API.

danielsn · 2026-05-07T18:49:06Z

Thanks for the review @pablogsal and @vstinner. I talked with @pablogsal offline, and I think we can indeed push most of the work onto the caller. The one remaining piece is a signal-safe way to walk the stack frames themselves, which I think can be done with a fairly small API change. I'll test this out and get back to you with an updated diff :)

…ration Add four new public APIs for walking the Python call stack without allocating memory, changing reference counts, or acquiring the GIL, making them safe to call from signal handlers and custom memory allocator hooks: - PyUnstable_ThreadState_GetInterpreterFrame(tstate): returns the innermost complete interpreter frame, skipping entry trampolines and pre-RESUME frames automatically. - PyUnstable_InterpreterFrame_GetNextComplete(frame): returns the next complete calling frame, skipping incomplete frames. Mirrors PyFrame_GetBack semantics without allocating. - PyUnstable_InterpreterFrame_GetCodeSafe(frame): returns a borrowed reference to the code object, or NULL if freed memory is detected. - PyUnstable_InterpreterFrame_GetLineSafe(frame): returns the current line number, validating the instruction offset rather than asserting, safe when the frame may be partially torn down. All four use _Py_NO_SANITIZE_THREAD to suppress intentional racy reads and heuristics (_PyMem_IsPtrFreed) to detect freed memory. Switch tracemalloc and the fatal error traceback printer to use the new APIs instead of internal helpers. Add tests in Lib/test/test_capi/test_misc.py and test helpers in Modules/_testinternalcapi.c.

danielsn · 2026-05-11T20:44:18Z

+PyUnstable_ThreadState_GetInterpreterFrame(PyThreadState *tstate)
+{
+    _PyInterpreterFrame *frame = tstate->current_frame;
+    while (frame != NULL) {


We could add a "hop counter" to ensure that this function would deterministically exit

read-the-docs-community · 2026-05-11T20:44:56Z

Documentation build overview

📚 cpython-previews | 🛠️ Build #32858062 | 📁 Comparing b725a08 against main (448d7b9)

🔍 Preview build

127 files changed · + 4 added · ± 123 modified

+ Added

± Modified

danielsn · 2026-05-11T20:47:15Z

@pablogsal I have reworked the proposal to add the missing iterator function, with the rest of the work able to happen on the client side. LMK if this looks any better :)

danielsn · 2026-05-12T18:54:06Z

+   .. versionadded:: 3.15
+
+
+.. c:function:: struct _PyInterpreterFrame* PyUnstable_ThreadState_GetInterpreterFrame(PyThreadState *tstate)


Should these have a Safe suffix?

vstinner · 2026-05-13T11:42:32Z

It's hard or even impossible to write a reliable Python in-process debugger/profiler. For example, HEAD_LOCK() is missing whereas it's needed to iterate safely on interpreters and thread states: see issue gh-85234. Interpreters, thread states, frames and code objects can be added and destroyed anytime. For example, it's possible to get crashes when running faulthandler on a program which creates/destroys many threads in parallel while faulthandler is running: #116008 (comment).

"Safe" in _PyFrame_SafeGetCode() name is a trap. It's not safe, it can still crash if the frame became a dangling pointer in the meanwhile, and it returns a borrowed reference whereas the code object can be destroyed anytime.

faulthandler has to use heuristics such as checking _PyObject_IsFreed() to reduce the risk of crashes. But it's only a weak workaround, not a reliable solution. faulthandler uses a signal handler and so cannot use the Python C API or even allocate memory on the heap.

An out-of-process debugger/profiler is more reliable since it gets an error instead of a stack fault when reading invalid memory locations.

Building a more reliable in-process debugger/profiler would require synchronization and so call functions, change reference count, allocate memory on the heap, etc. It would be incompatible with a signal hander.

danielsn · 2026-05-19T15:40:50Z

Thanks for the links @vstinner.

There are use-cases I can see for an API like this, and I believe 2 of them still apply.

The first use-case is profiling from an in-process hook/wrapper (e.g. allocation/live-heap/lock/io profiling). In these cases, the system is in a stable state and the GIL is held (if in non-freethreaded mode). This is how tracemalloc works as I understand it, and we would like to build similar observability in userspace. The problem is that the current public API for stackwalking potentially reenters the interpreter, which makes reasoning about correctness difficult. Are there issues that would prevent a non-reentrant API working in this case?
The second use-case is crash-reporting: i.e. doing what the current fault-handler does, but in userspace to generate structured data instead of free-text. In this case, any operations are clearly only best effort since we're inside a crashing program. I believe this case is still valuable, but with better naming to indicate that while the functions do not signal-unsafe operations like retentering the interpreter or allocating memory, this does not mean they are “safe”, merely “less unsafe”.
The third is an in-process signal-based CPU profiler, and I agree that sadly this wouldn’t work for that. I will try to propose better naming to indicate that this is not a supported use-case at this time.

danielsn · 2026-05-26T18:52:31Z

@vstinner @pablogsal I have updated the function names and descriptions to follow the discussion above.

vstinner · 2026-05-28T17:06:41Z

Adding a new C API for in-process debugger/profiler would mean that we endorse this use case. I would prefer to not endorse it for the reason Pablo and me gave above. So I don't think that's a good idea to add such C API.

For example, PyUnstable_ThreadState_GetCurrentFrame() doesn't hold a strong reference and so can become a dangling pointer anytime. Other functions have a similar issue.

P403n1x87 · 2026-06-03T09:34:30Z

I understand some of the raised concerns have been discussed offline, so apologies if any of the following comments are redundant or moot at this point.

The majority of production Python profilers are out-of-process.

I feel that some of the (valid) points raised in previous comments hinge on putting all tools such as profilers into the same basket. In my experience, there is a set of categories that one would have to consider. Not all profilers are the same, and some make sense as out-of-process, but others are only viable as in-process. A sampling profiler (for wall/CPU time) is something that one might want to implement as an out-of-process tool, for various reasons. The stochastic nature of the tool means that a few incorrect stack walks can be tolerated. However, memory profilers cannot rely on a similar mechanism if they are to run with performance decent enough to be used in production. That's because the inevitable upscaling from a sampling mechanism would combine with potentially incorrect stacks, leading to an unacceptable compounding of errors. Therefore, I would argue that memory profilers would require efficient, in-process stack-walking logic.

It's hard or even impossible to write a reliable Python in-process debugger/profiler.

My understanding is that the proposed changes are not addressing the general (and hard) problem of adding an API for any kind of profilers/debuggers. The main scope of the proposed API is to provide efficient utilities that can be used for memory profiling. When we specialise the problem to just this case, I believe there are many simplifying assumptions that can be made. For once, grabbing a stacktrace from a hook triggered by a memory allocation will run effectively in-line with the allocation. This pins the call stack of the thread the unwinder is running on, meaning that one can assume that internal structures like frames, thread states, and everything else related to the thread are not going to mutate under the unwinder's feet. Surely some degree of caution is still required, but the problem is much more tractable than the general case.

But landing it here flips the direction of the maintenance burden

I agree that the burden of maintaining the new API to serve tools should not fall on CPython in general. However, my view is that, if we can keep the changes to a minimum and the new API surface as lean as possible, the trade-off would be that of potentially giving CPython an efficient built-in memory profiler that could well complement the newly introduced sampling profiler.

An out-of-process debugger/profiler is more reliable since it gets an error instead of a stack fault when reading invalid memory locations.

This is probably off-topic, but there is nothing preventing an in-process tool from using the same out-of-process techniques to read memory safely. Of course the shared resources in this case imply e.g. a different performance profile, but the point is that there are techniques that can be used to make in-process tools almost as safe as out-of-process ones.

pythongh-148925: Add PyUnstable_CollectCallStack and PyUnstable_Print…

a950a1e

…CallStack

danielsn requested review from a team, AA-Turner, ZeroIntensity, emmatyping and erlend-aasland as code owners April 23, 2026 19:53

bedevere-app Bot added the awaiting review label Apr 23, 2026

bedevere-app Bot mentioned this pull request Apr 23, 2026

Expose signal-safe way to get a structured stacktrace #148925

Open

danielsn commented Apr 23, 2026

View reviewed changes

Comment thread Include/cpython/traceback.h Outdated

danielsn commented Apr 23, 2026

View reviewed changes

Comment thread Include/cpython/traceback.h Outdated

pablogsal self-assigned this Apr 29, 2026

KowalskiThomas reviewed Apr 30, 2026

View reviewed changes

Comment thread Python/traceback.c Outdated

Comment thread Doc/c-api/traceback.rst Outdated

Comment thread Python/traceback.c Outdated

pablogsal requested changes Apr 30, 2026

View reviewed changes

bedevere-app Bot removed the awaiting review label Apr 30, 2026

bedevere-app Bot added the awaiting changes label Apr 30, 2026

danielsn added 2 commits May 11, 2026 16:39

Undo changes in preperation for a simpler API.

c56f335

danielsn requested a review from markshannon as a code owner May 11, 2026 20:41

danielsn commented May 11, 2026

View reviewed changes

danielsn changed the title ~~gh-148925: Add PyUnstable_CollectCallStack and PyUnstable_PrintCallStack~~ gh-148925: Add signal-safe PyUnstable_* APIs for call-stack iteration May 11, 2026

danielsn changed the title ~~gh-148925: Add signal-safe PyUnstable_* APIs for call-stack iteration~~ gh-148925: Add safe PyUnstable_* APIs for call-stack iteration May 11, 2026

ensure we get bytecode safely as well

671d19b

a little more code cleanup

774b1c3

danielsn commented May 12, 2026

View reviewed changes

danielsn added 3 commits May 26, 2026 14:32

clarify the names and intended uses of the functions

e672616

Merge branch 'main' into dsn/c-stracktrace

35f9148

small cleanups

b725a08

		.. versionadded:: 3.15


		.. c:function:: struct _PyInterpreterFrame* PyUnstable_ThreadState_GetInterpreterFrame(PyThreadState *tstate)

Uh oh!

Conversation

danielsn commented Apr 23, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

python-cla-bot Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pablogsal commented Apr 29, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

pablogsal left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bedevere-app Bot commented Apr 30, 2026

Uh oh!

pablogsal commented Apr 30, 2026

Uh oh!

vstinner commented Apr 30, 2026

Uh oh!

danielsn commented May 7, 2026

Uh oh!

danielsn May 11, 2026

Choose a reason for hiding this comment

Uh oh!

read-the-docs-community Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Documentation build overview

Uh oh!

danielsn commented May 11, 2026

Uh oh!

danielsn May 12, 2026

Choose a reason for hiding this comment

Uh oh!

vstinner commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

danielsn commented May 19, 2026

Uh oh!

danielsn commented May 26, 2026

Uh oh!

vstinner commented May 28, 2026

Uh oh!

P403n1x87 commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

danielsn commented Apr 23, 2026 •

edited by github-actions Bot

Loading

python-cla-bot Bot commented Apr 23, 2026 •

edited

Loading

pablogsal left a comment •

edited

Loading

read-the-docs-community Bot commented May 11, 2026 •

edited

Loading

vstinner commented May 13, 2026 •

edited

Loading