fix(instrumentation): langchain association properties context cleanup#4055
fix(instrumentation): langchain association properties context cleanup#4055kamilamin123 wants to merge 4 commits intotraceloop:mainfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds tracking and cleanup for OpenTelemetry Changes
Sequence Diagram(s)sequenceDiagram
participant Handler as TraceloopCallbackHandler
participant SpanHolder as SpanHolder
participant ContextAPI as context_api
participant Span as Span
rect rgba(200,200,255,0.5)
Handler->>SpanHolder: request/create span (run_id, metadata)
end
rect rgba(200,255,200,0.5)
Handler->>ContextAPI: attach(association_properties=...)
ContextAPI-->>Handler: association_properties_token
Handler->>SpanHolder: store association_properties_token
end
rect rgba(255,200,200,0.5)
Handler->>Span: end span
Handler->>SpanHolder: detach association_properties_token (and span token)
ContextAPI-->>Handler: detached
end
rect rgba(255,255,200,0.5)
Handler->>SpanHolder: replace existing holder for same run_id
Handler->>ContextAPI: detach previous association_properties_token (if any)
ContextAPI-->>Handler: detached
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (1)
208-217: Pre-existing: child holders are not cleaned up here.Outside the scope of this PR, but worth noting: when
_end_spanrecursively ends child spans (lines 209–213), it does not call_detach_holder_contextson them ordelthem fromself.spans. If a parent ends before its children, childtoken/association_properties_tokenwill continue to leak. Consider a follow-up to apply the same lifecycle handling to child holders.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py` around lines 208 - 217, The _end_span method currently ends child spans (in the loop over self.spans[run_id].children) but does not call _detach_holder_contexts nor remove the child entries from self.spans, which can leak child holder tokens (e.g., token and association_properties_token); update _end_span so that for each child_id you: retrieve the child holder entry (self.spans[child_id]), call self._detach_holder_contexts(...) for that child holder, end the child_span only if not already ended, and then delete self.spans[child_id] before continuing, preserving the existing behavior for the parent span and final cleanup of self.spans[run_id].packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py (1)
202-235: Good duplicate-run_id replacement coverage; consider adding the LLM variant.This nicely covers
_create_spanreplacement semantics. Optional follow-up: add a parallel test that exercises_create_llm_spantwice with the samerun_idandmetadata=set, asserting that both suppression andassociation_propertiesare cleared after_end_span. Currentlytest_duplicate_run_id_leaks_suppression_tokenonly checks suppression, andtest_duplicate_run_id_replaces_association_propertiesonly goes through_create_span, so the combined LLM-with-metadata duplicate path isn't directly asserted.Note:
first_spanhere is intentionally never.end()-ed to focus the assertion on context-token cleanup; a brief inline comment to that effect (beyond the existing "sanity" assertion) would prevent future readers from "fixing" it.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py` around lines 202 - 235, Add a new test mirroring test_duplicate_run_id_replaces_association_properties but using the handler._create_llm_span API: call handler._create_llm_span twice with the same run_id and different metadata, assert the first LLM span remains open (add a short inline comment that the first span is intentionally not ended), verify that association metadata from the first span is cleared after creating the replacement (via _association_properties()), verify suppression-token behavior as in test_duplicate_run_id_leaks_suppression_token, then call handler._end_span on the second span and assert _association_properties() is empty and suppression is cleared; reference handler._create_llm_span, handler._end_span, _association_properties, and test_duplicate_run_id_leaks_suppression_token when locating where to add this test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py`:
- Around line 208-217: The _end_span method currently ends child spans (in the
loop over self.spans[run_id].children) but does not call _detach_holder_contexts
nor remove the child entries from self.spans, which can leak child holder tokens
(e.g., token and association_properties_token); update _end_span so that for
each child_id you: retrieve the child holder entry (self.spans[child_id]), call
self._detach_holder_contexts(...) for that child holder, end the child_span only
if not already ended, and then delete self.spans[child_id] before continuing,
preserving the existing behavior for the parent span and final cleanup of
self.spans[run_id].
In
`@packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py`:
- Around line 202-235: Add a new test mirroring
test_duplicate_run_id_replaces_association_properties but using the
handler._create_llm_span API: call handler._create_llm_span twice with the same
run_id and different metadata, assert the first LLM span remains open (add a
short inline comment that the first span is intentionally not ended), verify
that association metadata from the first span is cleared after creating the
replacement (via _association_properties()), verify suppression-token behavior
as in test_duplicate_run_id_leaks_suppression_token, then call handler._end_span
on the second span and assert _association_properties() is empty and suppression
is cleared; reference handler._create_llm_span, handler._end_span,
_association_properties, and test_duplicate_run_id_leaks_suppression_token when
locating where to add this test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 315a3ecc-0694-4e9d-8238-30c3281b8542
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.pypackages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/span_utils.pypackages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
208-222: Minor: detach-vs-end ordering is asymmetric between children and parent.For each child, tokens are detached before
child_span.end(); for the parent (run_id), tokens are detached afterspan.end(). Functionally equivalent here (span finalization doesn't depend on the active OTel context), but the asymmetry is easy to misread. Consider detaching the parent's tokens beforespan.end()for symmetry, or document the intent.♻️ Optional consistency tweak
- if span.end_time is None: # avoid warning on ended spans - span.end() - self._detach_holder_contexts(self.spans[run_id]) + self._detach_holder_contexts(self.spans[run_id]) + if span.end_time is None: # avoid warning on ended spans + span.end()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py` around lines 208 - 222, The _end_span method detaches context tokens for children before ending their spans but detaches the parent's tokens after calling span.end(), creating an asymmetric ordering; update _end_span to call self._detach_holder_contexts(self.spans[run_id]) before calling span.end() (matching the child flow) so the parent span's contexts are detached prior to span.end(), keeping the logic consistent for run_id, span, and using the existing _detach_holder_contexts and spans structures.
466-471: Stale comment — the described action no longer happens here.The comment claims the old holder is captured and detached before
_create_spanruns, but no such capture/detach exists at this call site anymore — that responsibility has been moved into_create_spanvia the new_detach_holder_contexts(old_holder)block. As written, the comment will mislead future maintainers into expecting (or re-adding) detach logic here.📝 Suggested rewrite
- # Capture and detach the old holder BEFORE _create_span runs. - # _create_span unconditionally overwrites self.spans[run_id], which - # would make the old SpanHolder (and its suppression token) unreachable. - # Detaching here, before the overwrite, ensures we clean up the original - # suppression token from a duplicate run_id and that the new suppression - # is layered on top of the clean baseline context. + # _create_span detaches any prior holder's tokens (including a stale + # suppression token from a duplicate run_id) before overwriting + # self.spans[run_id], so the new suppression below layers on top of a + # clean baseline context. span = self._create_span(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py` around lines 466 - 471, The existing comment is stale: it says the old SpanHolder is captured and detached before _create_span runs, but that logic was moved into _create_span via the new _detach_holder_contexts(old_holder) call; update the comment around the call site so it no longer claims a pre-detach occurs here, instead succinctly noting that _create_span is responsible for detaching any existing SpanHolder (via _detach_holder_contexts) and that it will overwrite self.spans[run_id], so callers should rely on _create_span to manage suppression tokens and context cleanup; reference SpanHolder, _create_span, _detach_holder_contexts, self.spans, and run_id in the revised comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py`:
- Around line 208-222: The _end_span method detaches context tokens for children
before ending their spans but detaches the parent's tokens after calling
span.end(), creating an asymmetric ordering; update _end_span to call
self._detach_holder_contexts(self.spans[run_id]) before calling span.end()
(matching the child flow) so the parent span's contexts are detached prior to
span.end(), keeping the logic consistent for run_id, span, and using the
existing _detach_holder_contexts and spans structures.
- Around line 466-471: The existing comment is stale: it says the old SpanHolder
is captured and detached before _create_span runs, but that logic was moved into
_create_span via the new _detach_holder_contexts(old_holder) call; update the
comment around the call site so it no longer claims a pre-detach occurs here,
instead succinctly noting that _create_span is responsible for detaching any
existing SpanHolder (via _detach_holder_contexts) and that it will overwrite
self.spans[run_id], so callers should rely on _create_span to manage suppression
tokens and context cleanup; reference SpanHolder, _create_span,
_detach_holder_contexts, self.spans, and run_id in the revised comment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2c03cec5-b12a-4049-9f43-c2796eeb1fd4
📒 Files selected for processing (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.pypackages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py (2)
198-222: Helper +_end_spancleanup logic looks correct.LIFO order is preserved for the common case:
_create_spanattachesassociation_properties_tokenfirst (line 298) then the span context token (line 336), so detachingtokenbeforeassociation_properties_tokenin_detach_holder_contextsis the right order for both non-LLM and LLM spans (wheretokenbecomes the suppression token attached on top ofassociation_properties_token). Children are also drained before the parent, which keeps the context stack consistent with the attach order under normal nesting.One minor hygiene note:
_detach_holder_contextsdoesn't null outspan_holder.token/span_holder.association_properties_tokenafter a successful detach. Today every call site either deletes the holder or overwrites it immediately, so this is safe — but if a future change ever re-invokes the helper on the same holder, the same tokens would be detached twice._safe_detach_contextswallows the resulting error, so it's just defensive cleanup, not a correctness issue.♻️ Optional: clear tokens after detach to make the helper idempotent
def _detach_holder_contexts(self, span_holder: SpanHolder | None) -> None: """Detach any tracked context tokens stored on a span holder.""" if span_holder is None: return if span_holder.token: self._safe_detach_context(span_holder.token) + span_holder.token = None if span_holder.association_properties_token: self._safe_detach_context(span_holder.association_properties_token) + span_holder.association_properties_token = None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py` around lines 198 - 222, The helper _detach_holder_contexts may detach the same context token twice if called again on the same SpanHolder; after calling self._safe_detach_context for span_holder.token and span_holder.association_properties_token, set span_holder.token = None and span_holder.association_properties_token = None to clear the references and make _detach_holder_contexts idempotent; update the function in callback_handler.py where SpanHolder is handled (and ensure related flows in _end_span that delete holders still work) so subsequent calls won't attempt to detach already-cleared tokens.
495-526: Redundantcurrent_holder.token = None— the holder is replaced on the very next statement.At line 515 you overwrite
self.spans[run_id]with a brand-newSpanHolder, so mutatingcurrent_holder.tokentoNonefirst has no observable effect (and the only other reference,current_holder.association_properties_token, is read after this point, but thetokenfield isn't). It's harmless but a small footgun: it impliescurrent_holderkeeps living, when in fact it's discarded. Consider dropping the assignment for clarity.The rest of the swap (detach the span context token from
_create_span, attach suppression on top, then carry forward the freshly-attachedassociation_properties_tokeninto the replacement holder) is correct — the assoc-props token stays referenced exactly once so_end_spancan detach it later.♻️ Drop the dead mutation
current_holder = self.spans.get(run_id) if current_holder is not None and current_holder.token is not None: self._safe_detach_context(current_holder.token) - current_holder.token = None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py` around lines 495 - 526, The assignment current_holder.token = None is redundant because you immediately replace self.spans[run_id] with a new SpanHolder; remove that line to avoid misleading mutation. In callback_handler.py inside the block that calls self._safe_detach_context(current_holder.token) and then constructs SpanHolder(..., association_properties_token=(current_holder.association_properties_token if current_holder else None), ...), delete the current_holder.token = None statement so only the detach occurs and the association_properties_token is still read from current_holder for the new SpanHolder (no other changes to _safe_detach_context or SpanHolder construction required).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py`:
- Around line 198-222: The helper _detach_holder_contexts may detach the same
context token twice if called again on the same SpanHolder; after calling
self._safe_detach_context for span_holder.token and
span_holder.association_properties_token, set span_holder.token = None and
span_holder.association_properties_token = None to clear the references and make
_detach_holder_contexts idempotent; update the function in callback_handler.py
where SpanHolder is handled (and ensure related flows in _end_span that delete
holders still work) so subsequent calls won't attempt to detach already-cleared
tokens.
- Around line 495-526: The assignment current_holder.token = None is redundant
because you immediately replace self.spans[run_id] with a new SpanHolder; remove
that line to avoid misleading mutation. In callback_handler.py inside the block
that calls self._safe_detach_context(current_holder.token) and then constructs
SpanHolder(...,
association_properties_token=(current_holder.association_properties_token if
current_holder else None), ...), delete the current_holder.token = None
statement so only the detach occurs and the association_properties_token is
still read from current_holder for the new SpanHolder (no other changes to
_safe_detach_context or SpanHolder construction required).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cca33ac5-457a-4bd8-be6f-f1ede9a7eef2
📒 Files selected for processing (1)
packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py
closes #4054
feat(instrumentation): ...orfix(instrumentation): ....What changed
association_propertiescontext attachment token on LangChainSpanHolderrun_id_create_llm_span()swaps the holder token to suppression contextrun_idreplacementWhy
Issue #4054 identified that
_create_span()attachesassociation_propertiesinto OpenTelemetry context without retaining a token to detach later. That can leave stale metadata in context after the span ends or when a holder is overwritten.Impact
This fixes a context leak in the LangChain instrumentation so downstream spans and logs do not inherit stale
association_propertiesfrom earlier runs.Root cause
_create_span()calledcontext_api.attach(context_api.set_value(...))forassociation_propertiesbut dropped the returned token. Since the token was never stored on the span holder, there was no matching detach during_end_span()or holder replacement.Validation
python3 -m compileall packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.pyPYTHONPATH=packages/opentelemetry-instrumentation-langchain ./.venv/bin/python -m pytest -q --noconftest packages/opentelemetry-instrumentation-langchain/tests/test_context_token_lifecycle.py5 passedSummary by CodeRabbit