
Speed up Graph push/pull per-frame paths #2316

Merged
WyattBlue merged 1 commit into main from filter-graph-perf on Jun 16, 2026

@WyattBlue

Member

What

Two internal optimizations to the filter Graph per-frame hot paths:

  1. `FilterContext` caches a one-byte filter kind (source / video sink / audio sink / other) at wrap time, replacing the per-frame `self.filter.name in (...)` / `== "buffersink"` checks. Those re-converted `filter.name` (a C string) to a fresh Python `str` and ran tuple-membership tests on every pushed/pulled frame.
  2. `Graph` caches the `buffer`/`abuffer` source contexts (`_video_sources` / `_audio_sources`), so `push`/`vpush` index a list attribute directly instead of doing a `_context_by_type` dict lookup per call (and the `None` flush no longer allocates a concatenated list on the per-frame path).
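The two caches above can be illustrated with a minimal pure-Python sketch. This is not the actual Cython code from the change: `FilterKind`, `classify`, `FilterContextSketch`, and `GraphSketch` are hypothetical names made up for illustration, and only the attribute names `_video_sources` / `_audio_sources` and the FFmpeg filter names come from the description.

```python
from enum import IntEnum


class FilterKind(IntEnum):
    """Hypothetical small-integer classification, computed once per context."""
    OTHER = 0
    SOURCE = 1
    VIDEO_SINK = 2
    AUDIO_SINK = 3


def classify(name: str) -> FilterKind:
    # The string comparisons happen here, once, instead of on every frame.
    if name in ("buffer", "abuffer"):
        return FilterKind.SOURCE
    if name == "buffersink":
        return FilterKind.VIDEO_SINK
    if name == "abuffersink":
        return FilterKind.AUDIO_SINK
    return FilterKind.OTHER


class FilterContextSketch:
    """Stand-in for a wrapped filter context (assumption, not the real class)."""

    def __init__(self, name: str):
        self.name = name
        self.kind = classify(name)  # cached at "wrap time"
        self.frames = []

    def push(self, frame):
        # Per-frame check is now an integer comparison, not a string test.
        if self.kind != FilterKind.SOURCE:
            raise ValueError(f"cannot push to non-source filter {self.name!r}")
        self.frames.append(frame)


class GraphSketch:
    """Stand-in for the graph, keeping source contexts in plain lists."""

    def __init__(self):
        self._video_sources = []
        self._audio_sources = []

    def add(self, name: str) -> FilterContextSketch:
        ctx = FilterContextSketch(name)
        # Source contexts are recorded once, when they are added.
        if name == "buffer":
            self._video_sources.append(ctx)
        elif name == "abuffer":
            self._audio_sources.append(ctx)
        return ctx

    def vpush(self, frame):
        # Direct list index, no per-call dictionary lookup.
        self._video_sources[0].push(frame)
```

In this sketch the cost of classifying a filter moves from every `push` call to the single `add` call, which is the same trade the description makes at wrap time.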

Why

The per-frame Python/Cython glue is dwarfed by the actual ffmpeg filtering (done under nogil), so this won't move the needle on full-size-frame workloads. It is still a clean win for high-throughput cases (many small frames, audio, trivial filters).

Impact

No API or behavior change — purely internal.

On a trivial buffer -> buffersink graph (64×64 frames), the push+pull round trip drops from ~2.0 µs/frame (~500K fps) to ~1.25 µs/frame (~800K fps).
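For reference, the frames-per-second figures follow directly from the per-frame latencies. A small sketch of the conversion (the helper name `fps_from_us_per_frame` is made up for illustration):

```python
def fps_from_us_per_frame(us_per_frame: float) -> float:
    """Convert a per-frame latency in microseconds to frames per second."""
    return 1_000_000 / us_per_frame


before = fps_from_us_per_frame(2.0)   # reported latency before the change
after = fps_from_us_per_frame(1.25)   # reported latency after the change
speedup = after / before              # throughput ratio, roughly 1.6x

print(f"{before:,.0f} fps -> {after:,.0f} fps ({speedup:.2f}x)")
```

The ratio applies only to the round-trip overhead on this trivial graph; as noted under "Why", real filtering work is expected to dominate on full-size frames.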

Testing

  • Built against the restricted (CI allowlist) FFmpeg 8.1.1 build.
  • Full test suite: 443 passed, 40 skipped.
  • make lint clean.

Cache a one-byte filter kind on FilterContext (source / video sink /
audio sink / other) at wrap time, so push/pull no longer reconvert
filter.name (a C string) to a Python str and do tuple-membership tests
on every frame.

Cache the buffer/abuffer source contexts on Graph so push/vpush index a
list attribute directly instead of looking them up in _context_by_type
each call.

Pure internal change, no API or behavior difference. On a trivial
buffer->buffersink graph the push+pull round trip drops from ~2.0 to
~1.25 us/frame.
@WyattBlue WyattBlue force-pushed the filter-graph-perf branch from c2df65c to 8445fd1 Compare June 16, 2026 05:36
@WyattBlue WyattBlue merged commit d9d12c5 into main Jun 16, 2026
8 checks passed
@WyattBlue WyattBlue deleted the filter-graph-perf branch June 16, 2026 05:43