Stripe size-sorted files across parallel jobs#5844
Conversation
ondrejmirtes
left a comment
There was a problem hiding this comment.
I like this change, I wondered about it several times myself.
| 124, | ||
| 2, | ||
| [30, 30, 30, 30, 4], | ||
| [25, 25, 25, 25, 24], |
There was a problem hiding this comment.
I think the tests could be more expressive here. Could we supply differently sized files here, or maybe change the Scheduler signature so that we provide file sizes from the outside and the class is still unit testable?
There was a problem hiding this comment.
Done — scheduleWork() now takes a callable(string): int for the file sizes, so the class is unit-testable without touching the filesystem; both production callers pass fn ($file) => (int) @filesize($file). Added two expressive tests with synthetic sizes: one asserts the exact striped layout for six distinctly-sized files (the three heaviest files end up in three different jobs, each paired with a light file), the other checks every file is scheduled exactly once and no job exceeds jobSize for a larger irregular set. The existing count-based dataset tests now pass a constant-size callback.
| $numberOfJobs = (int) ceil(count($files) / $this->jobSize); | ||
| $stripedJobs = []; | ||
| foreach ($files as $i => $file) { | ||
| $stripedJobs[$i % $numberOfJobs][] = $file; | ||
| } | ||
|
|
||
| $jobs = array_values($stripedJobs); |
There was a problem hiding this comment.
could this last block be simplified to $jobs = array_chunk($files, $this->jobSize); again?
There was a problem hiding this comment.
It can't, unfortunately — that's the one shape I measured as actively harmful. With the list sorted by size, array_chunk() puts the ~20 heaviest files together in the first job, and whichever worker picks it up becomes a straggler the whole run waits for: sort + chunk measured +46% cold wall time vs. unsorted chunking on the src/Type self-analysis benchmark, while sort + round-robin striping is what produces the −21..23%. The striping is the point, not an artifact; the new testHeaviestFilesAreSpreadAcrossJobs test pins exactly this property.
Files were chunked into jobs in path order, so a job could end up with all of a directory's heaviest files and become the straggler the whole run waits for. Sorting by file size and dealing files round-robin across jobs gives every job the same size profile: -21..23% cold wall time on a 14-core machine, CPU unchanged. Sorting without striping makes it worse (+46% wall - the heavy files all land in the first job), which is why the striping is essential rather than cosmetic. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
eede67f to
b8e6fde
Compare
|
So what are hyperfine numbers for this one when running complete |
|
Complete
−1.47 s, 1.08× faster ( |
|
Thank you! |
|
looks like this PR regressed a e2e test: |
What & why
The scheduler chunks files into jobs in path order. Directories tend to cluster files of
similar weight, so one job can collect a directory's heaviest files and become the
straggler the whole run waits for while other workers sit idle.
This change sorts the files by size and deals them round-robin across the jobs, so every
job gets the same size profile. Job count and the per-job ceiling (
jobSize) stay thesame; only the composition changes. File sizes are supplied from the outside
(
callable(string): int), so the Scheduler itself stays unit-testable without touchingthe filesystem.
Benchmarks
Complete
make phpstanon phpstan-src (clears the result cache itself, so every run isa full analysis of 2,364 files), two clean worktrees with primed file caches,
hyperfine --warmup 1 --runs 6, Apple M4 Pro / PHP 8.5.7:make phpstanmake phpstan−1.47 s, 1.08× faster; user CPU unchanged within noise (111.3 s vs 113.3 s) — the win is
load balance, not less work. A reversed-order pass reproduces it (17.77 ± 0.24 s vs
19.04 ± 0.23 s), so it is not an ordering or thermal artifact.
One detail worth knowing: sorting without striping makes things considerably worse
(+46% wall on a smaller corpus), because plain chunking of a sorted list concentrates all
heavy files into the first job. The striping is what makes the sort pay off, and
testHeaviestFilesAreSpreadAcrossJobspins exactly that property.File size is a proxy for analysis cost, not a perfect predictor, but across repeated runs
it consistently beat path-order chunking and never lost to it.
Tests
make tests(12,713, green),make phpstan,make cs,make lintandmake composer-dependency-analyserall pass. Analysis output is byte-identical to thebaseline; only the distribution of files over worker jobs changes.
SchedulerTestcoversthe striped layout exactly (six distinctly-sized files: the three heaviest land in three
different jobs) and that every file is scheduled exactly once with jobs ≤ jobSize.
🤖 Generated with Claude Code