Conditionally build page pruning predicates#21480
Conversation
Page pruning predicates in the Parquet opener are constructed regardless of whether enable_page_index is set. Under high query load, this uses significant CPU time although these predicates are created and discarded quickly. This commit reorders the predicate creation flow to only construct page pruning predicates if enable_page_index is enabled. Regular predicates are created always as before.
cfcc878 to
e5905e2
Compare
alamb
left a comment
There was a problem hiding this comment.
Thank you @fpetkovski -- this is a good idea
Do you think there is any way to test this? otherwise we run a real danger of regressing performance under refactors (for example the one I am working on now with #21327
Though it is not clear to me what such a test would look like / how to do it without causing non trivial churn 🤔
|
This removes a redundant step, and the change looks reasonable. However, I’m curious how this step ends up consuming significant CPU time — building a predicate per file doesn’t seem very expensive. Do you have a micro-benchmark that demonstrates the impact on a representative workload? |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing optimize-pruning-predicate (e5905e2) to 91c2e04 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing optimize-pruning-predicate (e5905e2) to 91c2e04 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing optimize-pruning-predicate (e5905e2) to 91c2e04 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
|
||
| // Build predicates for this specific file | ||
| let (pruning_predicate, page_pruning_predicate) = build_pruning_predicates( | ||
| let pruning_predicate = build_pruning_predicates( |
There was a problem hiding this comment.
Maybe better is to delay it so they are only created after we loaded the page index, this avoids doing it as well for files without page index (and avoids it in short-circuit scenarios / query cancellation).
There was a problem hiding this comment.
I can help do this as a follow on PR too
There was a problem hiding this comment.
I looked into this as part of
it turns out that there is some trade off between creating the page pruning predicate unecessairly and loading the page index unecessairly (e.g. if the predicate will not create a pruning predicate).
I'll comment on the other PR
There was a problem hiding this comment.
It is actually pretty tricky to avoid the pruning predicate/ page indexes in general
I think @adriangb 's suggestion to cache them is probably a more effectve
|
Thanks everyone for the review. I will address comments and in the meantime I have attached a profile screenshot from our production in the PR description. We query many small files very frequently (thousands of QPS), which are already pre-filtered outside of datafusion so pruning predicates have little effect. As can be seen, half of query time is spent opening files. |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
Maybe the predicate is complex 🤔 |
|
|
||
| // Build predicates for this specific file | ||
| let (pruning_predicate, page_pruning_predicate) = build_pruning_predicates( | ||
| let pruning_predicate = build_pruning_predicates( |
There was a problem hiding this comment.
I can help do this as a follow on PR too
|
We need to clean up CI and ideally find some sort of tests for this, but I think it is an improvement over main -- thank you @fpetkovski |
|
One thing I wonder is if we can avoid per-file rewrites, pruning predicate creation, etc. by caching these things. In most systems you might have 2-3 different physical schemas (different writers, etc.) but often times just 1 so re-doing the exact same work across many files is wasteful. |
|
I've added a test which should verify the pruning functionality. When we disable the page index we get back all the rows, with the index enabled we only get back the page which matches the predicate. |
alamb
left a comment
There was a problem hiding this comment.
I think the code looks good to me - this PR has a few proposed follow ons which I will try and file tomorrow if no one beats me to it
@2010YOUY01 roughly 10-20% of the flamegraph in the PR description is building the page pruning predicate
|
Filed a ticket to track this idea |
|
Thanks again everyone. |
@fpetkovski -- if you have many files with similar schema, optimzing this more via #21554 may help substantially |

Rationale for this change
Page pruning predicates in the Parquet opener are constructed regardless of whether enable_page_index is set. Under high query load, this uses significant CPU time although these predicates are created and discarded quickly.
Which issue does this PR close?
What changes are included in this PR?
This commit reorders the predicate creation flow to only construct page pruning predicates if enable_page_index is enabled. Regular predicates are created always as before.
Are these changes tested?
I am relying on unit tests but I can do manual testing with a debugger.
Are there any user-facing changes?
Changes are are optimization and are not user-facing.