SIGSEGV in thread-sampling transaction profiler under concurrent HTTP load (Python 3.11, sentry-sdk 2.58.0) #6119

@saurabh-statisfy

Description

How do you use Sentry?

Sentry SaaS (sentry.io)

Version

2.58.0 (also reproduced on 2.17.0)

Steps to Reproduce

  1. Run a FastAPI app under uvicorn on Python 3.11, containerised (Linux, x86_64), with many concurrent HTTP requests.
  2. Each request performs synchronous outbound I/O (e.g. google-cloud-storage blob.download_as_bytes, arbitrary requests.Session.post / get) from an anyio worker thread (standard FastAPI run_in_threadpool pattern).
  3. sentry_sdk.init(dsn=..., traces_sampler=..., profiles_sample_rate=1.0) — any non-zero profiles_sample_rate is sufficient to trigger.
  4. Wait on the order of seconds under steady load.
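Step 3, as a hedged sketch (the DSN and sampler below are placeholders, not the real values; any non-zero profiles_sample_rate is sufficient):

```python
import sentry_sdk

sentry_sdk.init(
    dsn="...",                        # placeholder: your real DSN
    traces_sampler=lambda ctx: 1.0,   # placeholder sampler: sample every transaction
    profiles_sample_rate=1.0,         # any value > 0 starts the thread-sampling profiler
)
```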

Expected Result

No process-level crash from the profiler.

Actual Result

Worker process dies with SIGSEGV. Fault address is a small integer (observed: 0, 1, 2, 7, 8, 68, 72, 80, 111, 228) — classic use-after-free / null-deref-with-offset. Multiple different threads crash across different incidents; every crash has the profiler thread actively sampling frames at the moment of death.

With PYTHONFAULTHANDLER=1 enabled, faulthandler consistently shows the pattern:

Sibling thread — profiler (running at the moment of crash, every time):

File ".../sentry_sdk/profiler/transaction_profiler.py", line 711, in run
File ".../sentry_sdk/profiler/transaction_profiler.py", line 601, in _sample_stack
File ".../sentry_sdk/profiler/transaction_profiler.py", line 602, in <listcomp>
File ".../sentry_sdk/profiler/utils.py", line 167, in extract_stack
File ".../sentry_sdk/profiler/utils.py", line 167, in <genexpr>
File ".../sentry_sdk/profiler/utils.py", line 114, in frame_id

Current thread — mid-HTTP I/O via a Sentry stdlib patch (one representative stack; the exact app code above the stdlib patch varies, but the Sentry patch frame is always present):

File "/usr/local/lib/python3.11/ssl.py", line 1166, in read
File "/usr/local/lib/python3.11/ssl.py", line 1314, in recv_into
File "/usr/local/lib/python3.11/socket.py", line 718, in readinto
File "/usr/local/lib/python3.11/http/client.py", line 291, in _read_status
File "/usr/local/lib/python3.11/http/client.py", line 330, in begin
File "/usr/local/lib/python3.11/http/client.py", line 1415, in getresponse
File ".../sentry_sdk/integrations/stdlib.py", line 146, in getresponse       <-- Sentry patch
File ".../urllib3/connection.py", line 571, in getresponse
File ".../urllib3/connectionpool.py", line 534, in _make_request
File ".../urllib3/connectionpool.py", line 787, in urlopen
File ".../requests/adapters.py", line 644, in send
File ".../opentelemetry_instrumentation_requests/__init__.py", line 432, in instrumented_send
File ".../requests/sessions.py", line 703, in send
File ".../requests/sessions.py", line 589, in request
File ".../google/auth/transport/requests.py", line 543, in request
File ".../google/cloud/storage/_media/requests/download.py", line 253, in retriable_request
File ".../google/api_core/retry/retry_unary.py", line 147, in retry_target
File ".../google/cloud/storage/blob.py", line 1094, in _do_download
File ".../google/cloud/storage/blob.py", line 1530, in download_as_bytes
File ".../google/cloud/storage/blob.py", line 1651, in download_as_string
File ".../<app>/handler.py", in <app_handler>
File ".../starlette/concurrency.py", line 42, in run_in_threadpool
File ".../anyio/to_thread.py", line 63, in run_sync
File ".../anyio/_backends/_asyncio.py", line 1002, in run
File ".../sentry_sdk/integrations/fastapi.py", line 90, in _sentry_call
File ".../sentry_sdk/integrations/threading.py", line 133, in _run_old_run_func
File ".../sentry_sdk/integrations/threading.py", line 140, in run
File "/usr/local/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
File "/usr/local/lib/python3.11/threading.py", line 1002, in _bootstrap

Previously observed variant: same profiler sibling stack, but the crashing mutator frame was in tracing.Span.__init__ → uuid.uuid4(), entered via the sentry_sdk/integrations/stdlib.py:91 putrequest patch instead of getresponse. Both stdlib patch points reproduce the crash.

Analysis

The profiler thread in transaction_profiler.py samples the PyFrameObjects of all other threads at the configured frequency, via extract_stack → frame_id (utils.py:167 → 114). frame_id reads fields from a PyFrameObject that another thread may be actively mutating or freeing. There is no GIL-level synchronisation across the sample boundary: the sampler is scheduled cooperatively with mutator threads, but the C-level attribute reads inside frame_id can see a partially-freed object if the mutator releases or reclaims a frame or code object mid-sample.

The signature (always tiny fault_addr, always in frame_id/extract_stack while a mutator is mid-Span construction around HTTP I/O) is consistent with that race. The Sentry stdlib.putrequest / getresponse patches are a frequent entry point because every outbound HTTP call constructs a new Span → allocates a uuid4() → lots of short-lived frame/code churn right in the profiler's sampling window.
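The sampling pattern described above can be approximated with a stdlib-only sketch. This is an illustrative analog, not the SDK's actual code: frame_id here is a stand-in for the real utils.py:114 function, but the shape of the walk (snapshot every thread's top frame, then follow f_back while the owning thread keeps running) is the same window in which the race occurs:

```python
import sys
import threading
import time

def frame_id(frame):
    # Illustrative analog of the SDK's frame_id: reads attributes of a frame
    # object owned by another thread. Nothing pins the frame for the duration
    # of the walk; the owning thread may tear it down concurrently.
    return (frame.f_code.co_name, frame.f_code.co_filename, frame.f_lineno)

def sample_once():
    # sys._current_frames() returns the topmost frame of every thread; the
    # sampler then walks f_back. This is the profiler's sampling window.
    stacks = {}
    for tid, frame in sys._current_frames().items():
        stack = []
        while frame is not None:
            stack.append(frame_id(frame))
            frame = frame.f_back
        stacks[tid] = stack
    return stacks

def worker():
    time.sleep(0.2)  # stands in for blocking HTTP/SSL I/O in the thread pool

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)          # let the worker reach its blocking call
snapshot = sample_once()  # sample while the worker is mid-"I/O"
t.join()
```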

Workarounds

  • profiles_sample_rate=0.0 — stops the profiler thread entirely. Confirmed to eliminate crashes.
  • profiles_sample_rate=0.1 (reduced from 1.0) — reduces crash rate proportionally but does not eliminate it. A single sampled transaction hitting the race is enough to kill the container.
  • Upgrading sentry-sdk from 2.17.0 to 2.58.0 does not fix it. The legacy thread-sampling profiler is still used whenever profiles_sample_rate > 0; 2.24.1+ added the continuous profiler as an opt-in, but did not replace the transaction profiler.
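The confirmed workaround, as a hedged sketch (placeholder DSN and sampler; tracing itself can stay enabled):

```python
import sentry_sdk

sentry_sdk.init(
    dsn="...",                        # placeholder: your real DSN
    traces_sampler=lambda ctx: 1.0,   # placeholder: tracing unaffected
    profiles_sample_rate=0.0,         # 0.0 prevents the profiler thread from starting
)
```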

Ask

  • Can the Sentry team confirm whether the thread-sampling transaction profiler is considered safe for production use on Python 3.11+ under concurrent HTTP load?
  • If not, would the docs acknowledge this (it's currently presented as a general-purpose option)?
  • Could frame_id / extract_stack be hardened against concurrent mutation, or is migration to the continuous profiler the official path forward?

Environment

  • OS: Linux (Debian slim base, x86_64), managed container platform
  • Python: 3.11 (CPython, stock)
  • Runtime: FastAPI 0.114, Starlette 0.37, uvicorn 0.34, anyio 4.13
  • Third-party in-stack at crash: requests 2.32, urllib3 2.6, google-cloud-storage 3.10, google-auth 2.49, opentelemetry-instrumentation-requests 0.62b0
  • PYTHONFAULTHANDLER=1 was set to capture the traces above; without it, the crash is logged only as "Container terminated on signal 11"
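For anyone reproducing this on a platform where environment variables are awkward to set, faulthandler can equally be enabled in-process at startup; this is equivalent to PYTHONFAULTHANDLER=1:

```python
import faulthandler
import sys

# Dump the stacks of all threads to stderr on SIGSEGV, SIGFPE, SIGABRT,
# SIGBUS and SIGILL, just like PYTHONFAULTHANDLER=1.
faulthandler.enable(file=sys.stderr, all_threads=True)
```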
