-
Notifications
You must be signed in to change notification settings - Fork 131
Improve visitor performance by reducing Vec and String allocations
#6337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Benchmarks: Random AccessSummary
|
Benchmarks: TPC-H SF=1 on NVMESummary
Detailed Results Table
|
Benchmarks: FineWeb NVMeSummary
Detailed Results Table
|
Benchmarks: TPC-DS SF=1 on NVMESummary
Detailed Results Table
|
Benchmarks: TPC-H SF=1 on S3Summary
Detailed Results Table
|
Benchmarks: TPC-H SF=10 on NVMESummary
Detailed Results Table
|
Benchmarks: FineWeb S3Summary
Detailed Results Table
|
Benchmarks: Statistical and Population GeneticsSummary
Detailed Results Table
|
🚨🚨🚨❌❌❌ SQL BENCHMARK FAILED ❌❌❌🚨🚨🚨Benchmark |
Benchmarks: Clickbench on NVMESummary
Detailed Results Table
|
Benchmarks: CompressionSummary
Detailed Results Table
|
Merging this PR will degrade performance by 16.58%
Performance Changes
Comparing Footnotes
|
c235d20 to
531a071
Compare
Benchmarks: PolarSignals ProfilingSummary
Detailed Results Table
|
0fda89b to
015b5d7
Compare
|
moving the filter piece into a different PR so I can try and measure a bigger thing |
1d9c33f to
8172fcb
Compare
|
The codspeed improvement here is real 🥳, like 20% of those benchmarks was just formatting the names of the chunks. |
|
@gatesn following our conversation on Friday, are you going to break these APIs and I should just close this PR? |
Vec and String allocations
|
I will take a look. Very keen to take these performance wins, just want to make sure it moves in the direction we were going to avoid extra work. |
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
Signed-off-by: Adam Gutglick <[email protected]>
9f62be2 to
d1af20c
Compare
joseph-isaacs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good and is very easy to unwind if needed
Using
dhat, identified these two code paths as heavy allocators, running the full tpch benchmark.This PR fills out many default implementations in
VisitorVTable, making them much more efficient for many cases, both by reducing recursive function calls and by not formatting names. As far as I can tell - many of the improvements shown by codspeed are completely real here.Using the existing iterator when filtering instead of allocating a Vec saves about 2GB of memory allocation (around 1.6% of all memory allocated), which happens in a tiny amount of allocations (0.02%).Moving to a separate PR, I want to try and make a bigger change and be able to measure it.nth_child(should just bechild?) was just another offender that was easy to experiment with, it doesn't allocate a lot of data but it does save ~0.6% of all allocations.