Skip type erasure in trace_future/trace_block when no tracer is set #21414
Dandandan wants to merge 1 commit into apache:main
When no custom JoinSetTracer is registered (the common case), the trace_future and trace_block functions were still performing unnecessary type erasure (boxing as Box<dyn Any + Send>) and downcast operations on every spawned task.
Profiling ClickBench queries showed trace_future's type erasure overhead accounted for ~12% of total CPU time across all queries.
This adds a fast path that checks GLOBAL_TRACER.get().is_none() and returns the future/closure directly without the erase/trace/downcast pipeline when no custom tracer is configured.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
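The fast path described above can be sketched for the blocking-closure case as follows. This is a minimal, std-only illustration and not the code in this PR: `OnceLock` stands in for whatever cell type the crate actually uses, the trait is trimmed to a single method, and the `BoxedAny`/`BoxedClosure` aliases are assumptions modeled on the description.

```rust
use std::any::Any;
use std::sync::OnceLock;

// Assumed aliases for the type-erased values passed through a tracer.
type BoxedAny = Box<dyn Any + Send>;
type BoxedClosure = Box<dyn FnOnce() -> BoxedAny + Send>;

/// Simplified stand-in for the tracer trait; only the closure hook is modeled.
trait JoinSetTracer: Send + Sync + 'static {
    fn trace_block(&self, f: BoxedClosure) -> BoxedClosure;
}

/// Unset unless a custom tracer has been registered.
static GLOBAL_TRACER: OnceLock<&'static dyn JoinSetTracer> = OnceLock::new();

fn trace_block<T, F>(f: F) -> Box<dyn FnOnce() -> T + Send>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    // Fast path: no tracer registered, so box the closure once and return it.
    let Some(tracer) = GLOBAL_TRACER.get() else {
        return Box::new(f);
    };

    // Slow path: erase the result type so the tracer can wrap the closure...
    let erased: BoxedClosure = Box::new(move || Box::new(f()) as BoxedAny);
    let traced = tracer.trace_block(erased);

    // ...then recover the concrete result type on the way out.
    Box::new(move || {
        *traced()
            .downcast::<T>()
            .expect("tracer must return the closure's original result type")
    })
}
```

With no tracer registered, the closure is boxed once and returned as-is; the erase/downcast wrapper is only built when `GLOBAL_TRACER` holds a tracer.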
run benchmarks
🤖 Benchmark running (GKE): comparing optimize-trace-future (45b0cd9) to c17c87c (merge-base) using tpcds
🤖 Benchmark running (GKE): comparing optimize-trace-future (45b0cd9) to c17c87c (merge-base) using tpch
🤖 Benchmark running (GKE): comparing optimize-trace-future (45b0cd9) to c17c87c (merge-base) using clickbench_partitioned
🤖 Benchmark completed (GKE) for tpcds. Resource usage reported for base (merge-base) and branch.
🤖 Benchmark completed (GKE) for clickbench_partitioned. Resource usage reported for base (merge-base) and branch.
Summary
- Adds a fast path to trace_future and trace_block that skips the Box<dyn Any + Send> type erasure and downcast pipeline when no custom JoinSetTracer is registered (the common case)
- When GLOBAL_TRACER is not set, returns the future/closure directly with minimal boxing overhead
Background
Profiling ClickBench queries showed trace_future's type erasure overhead accounted for ~12% of total CPU time. Even with the NoopTracer, every spawned task was paying the cost of:
- Boxing into Box<dyn Any + Send> (type erasure)
- Downcasting from Box<dyn Any + Send> to T
- Calling .boxed()
Benchmark results (ClickBench, single iteration)
Notable improvements:
Test plan
- cargo test -p datafusion-common-runtime passes
- cargo clippy -p datafusion-common-runtime --all-targets --all-features -- -D warnings clean
🤖 Generated with Claude Code
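For the future case, the per-task work listed under Background can be sketched as below. This is a std-only illustration under stated assumptions: `erased_round_trip`, `direct`, and `block_on` are hypothetical names, `Box::pin` is used in place of `.boxed()`, and no tracer is modeled (a no-op tracer would hand the erased future back unchanged).

```rust
use std::any::Any;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxedAny = Box<dyn Any + Send>;
type BoxFuture<T> = Pin<Box<dyn Future<Output = T> + Send>>;

/// Erase, then downcast: up to three heap allocations per task in this sketch.
fn erased_round_trip<T, F>(fut: F) -> BoxFuture<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
{
    // 1. Type erasure: box the future, and box its output as `dyn Any`.
    let erased: BoxFuture<BoxedAny> =
        Box::pin(async move { Box::new(fut.await) as BoxedAny });
    // 2. Downcast the output back to `T`, and 3. box the wrapping future again.
    Box::pin(async move {
        *erased
            .await
            .downcast::<T>()
            .expect("erased future must yield the original output type")
    })
}

/// Direct path: one box, no `dyn Any`, no downcast.
fn direct<T, F>(fut: F) -> BoxFuture<T>
where
    F: Future<Output = T> + Send + 'static,
    T: Send + 'static,
{
    Box::pin(fut)
}

/// Busy-polling executor, only for trying the sketch without an async runtime.
fn block_on<T>(mut fut: BoxFuture<T>) -> T {
    struct NoopWake;
    impl Wake for NoopWake {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Both functions produce the same output for the same input future; the difference is only in the intermediate allocations and the dynamic downcast.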