Store file cache entries with serialize() instead of var_export/include #5845
SanderMuller wants to merge 1 commit into
Loading a cache entry no longer pays PHP parse + AST-to-value cost for every hit (opcache is typically off on CLI); unserialize() of the same data is measurably cheaper across the thousands of per-file reflection and PHPDoc cache entries a cold run touches (-2% cold CPU, -4.7% in the ablation). Entries written in the old format fail to unserialize and count as a one-time cache miss. clearUnusedFiles() now recognises the serialized format - its keep predicate is built from CacheItem::class - and CACHED_CLEARED_VERSION is bumped so legacy var_export entries are purged once after upgrade. Without that, a missing or stale cache-cleared marker would have made cleanup delete every current-format entry as legacy garbage. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
I don't think it will be faster. Also with opcache enabled, reading the files is essentially free. |
|
One thing that passing size from the outside will allow us to do later is to construct true LoC number to be analysed. You can have a small file that uses 5 huge traits, would be nice if it's striped between the largest files to be analysed. But this PR doesn't have to do it now. |
|
Thanks for the feedback, I will look into it more carefully and get back to you! |
|
Measured it on the full
So you're right — wall time is a wash. User CPU comes out 2–4% lower with serialize (one-shot worker processes still pay the compile on every include; opcache SHM doesn't outlive them), but that's not enough to justify changing the cache format. Closing — thanks for pushing me to measure the real workload. The LoC-weighted striping idea is a good one; that gets easy once the size callback from #5844 is in. |
What & why
FileCacheStorage stores entries as var_export'd PHP files loaded via include. On CLI, where opcache is typically off, every cache hit pays full PHP parse plus AST-to-value construction. A cold run loads thousands of per-file reflection and PHPDoc entries this way. Storing the CacheItem with serialize() and reading it back with unserialize() is measurably cheaper: about 2% less total CPU on a cold src/Type self-analysis run (up to 4.7% in an ablation on a larger change set), and the entries shrink on disk.
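As a rough illustration of the load path described above, here is a minimal sketch of a serialize()/unserialize() round-trip with an instance check. DemoCacheItem, saveEntry() and loadEntry() are hypothetical names made up for this example; they are not PHPStan's actual classes or methods, and the real FileCacheStorage may differ in details such as locking and error handling.

```php
<?php

// Hypothetical stand-in for the cached value object; not PHPStan's CacheItem.
final class DemoCacheItem
{
    public function __construct(
        public string $variableKey,
        public mixed $data,
    ) {
    }
}

// Write the entry as a serialized blob. The file has no PHP opening tag,
// so it must be read as data and never passed to include.
function saveEntry(string $path, DemoCacheItem $item): void
{
    file_put_contents($path, serialize($item), LOCK_EX);
}

// Read the entry back. Anything that does not unserialize to a
// DemoCacheItem is reported as a cache miss (null).
function loadEntry(string $path): ?DemoCacheItem
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        return null;
    }

    // unserialize() returns false and raises a notice/warning on bad input;
    // the instanceof check below turns that into a plain miss.
    $value = @unserialize($raw);

    return $value instanceof DemoCacheItem ? $value : null;
}

$dir = sys_get_temp_dir() . '/demo-cache-' . bin2hex(random_bytes(4));
mkdir($dir);

// Round-trip of a current-format entry.
saveEntry($dir . '/entry.dat', new DemoCacheItem('v1', ['lines' => 42]));
$hit = loadEntry($dir . '/entry.dat');

// A file holding var_export-style PHP source instead of a serialized item
// fails the instance check and counts as a miss.
file_put_contents($dir . '/other.dat', "<?php return ['lines' => 42];");
$miss = loadEntry($dir . '/other.dat');
```

In this sketch a miss is indistinguishable from a missing file, which is what lets the cache simply rebuild the entry on the next write.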
Migration details, since a format switch deserves scrutiny:
- New-format entries are written to .dat files. An older PHPStan version sharing the same tmpDir looks for .php files, finds none, and treats it as a plain cache miss. Writing the new format into .php files would be worse than a miss: include of a file without a <?php tag echoes its raw bytes to stdout.
- Entries that do not pass the CacheItem instance check on load count as a one-time miss; the cache rebuilds from there.
- clearUnusedFiles() now keeps current-format files (the predicate is built from CacheItem::class) and CACHED_CLEARED_VERSION is bumped, so leftover var_export entries from before the switch get purged once. Previously the keep-predicate matched only the old format, which meant a missing or stale marker file would have deleted every current entry as legacy garbage.
- FileCacheStorageTest covers the round-trip, that cleanup keeps current-format entries when the marker file is missing, and that legacy .php entries get removed.
Tests
make tests (12,714, green), make phpstan, make cs, make lint and make composer-dependency-analyser all pass. Analysis output is byte-identical to the baseline.
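For readers who want to picture the cleanup behaviour that FileCacheStorageTest exercises, the following is one possible sketch of a keep-predicate derived from a class name. It assumes the predicate looks at the leading bytes of each file, which is only a guess at the approach; DemoCacheItem, currentFormatPrefix() and removeLegacyEntries() are hypothetical names, not the real clearUnusedFiles() implementation.

```php
<?php

// Hypothetical stand-in for the cached value object; not PHPStan's CacheItem.
final class DemoCacheItem
{
    public function __construct(
        public string $variableKey = '',
        public mixed $data = null,
    ) {
    }
}

// For a plain class, serialize() output starts with O:<name length>:"<name>":
// so the expected prefix can be derived from the class name alone.
function currentFormatPrefix(string $class): string
{
    return sprintf('O:%d:"%s":', strlen($class), $class);
}

/**
 * Delete every regular file in $dir that does not start with the
 * current-format prefix. Returns the sorted names of removed files.
 *
 * @return list<string>
 */
function removeLegacyEntries(string $dir, string $class): array
{
    $prefix = currentFormatPrefix($class);
    $removed = [];

    foreach (scandir($dir) ?: [] as $name) {
        $path = $dir . '/' . $name;
        if (!is_file($path)) {
            continue; // skips "." and ".." as well as subdirectories
        }

        // Only the first bytes are needed to recognise the format.
        $head = (string) file_get_contents($path, false, null, 0, 256);
        if (str_starts_with($head, $prefix)) {
            continue; // current format: keep
        }

        unlink($path);
        $removed[] = $name;
    }

    sort($removed);

    return $removed;
}

$dir = sys_get_temp_dir() . '/demo-cleanup-' . bin2hex(random_bytes(4));
mkdir($dir);

// One current-format entry and one legacy var_export-style entry.
file_put_contents($dir . '/current.dat', serialize(new DemoCacheItem('k', [1, 2, 3])));
file_put_contents($dir . '/legacy.php', '<?php return ' . var_export([1, 2, 3], true) . ';');

$removed = removeLegacyEntries($dir, DemoCacheItem::class);
```

Because the predicate recognises the current format rather than the old one, running the cleanup with no marker file present leaves current entries in place, which is the failure mode the PR description calls out.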
🤖 Generated with Claude Code