Unify cast handling by removing CastColumnExpr branches in pruning and ordering equivalence#21545
Open
kosiew wants to merge 4 commits intoapache:mainfrom
Open
Unify cast handling by removing CastColumnExpr branches in pruning and ordering equivalence#21545kosiew wants to merge 4 commits intoapache:mainfrom
CastColumnExpr branches in pruning and ordering equivalence#21545kosiew wants to merge 4 commits intoapache:mainfrom
Conversation
Eliminate CastColumnExpr compatibility in pruning logic. Consolidate equivalence cast substitution to handle only unified CastExpr. Remove redundant CastColumn-specific equivalence regression from dependency imports.
Rebuild unified CastExpr to preserve target field metadata using new_with_target_field(...). Update regression test in dependency.rs to utilize this field-aware CastExpr for better consistency and reliability in test cases.
Renamed the private helper `substitute_cast_like_ordering` to `substitute_cast_ordering`. Replaced the mutable substitution push loop with an iterator-based collection for better performance. Inlined single-use `arrow_schema` and `from_type` locals in cast pruning branches. Simplified the monotonic sort regression test by removing over-general `sort_columns`, collapsing one-element equality vectors, and using distinct case names for clarity.
Remove the unnecessary intermediate Vec in mod.rs. Implement rewrite_cast_child_to_prunable in pruning_predicate.rs and reuse it for both CastExpr and TryCastExpr to streamline casting operation handling.
Contributor
Author
|
@alamb |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
Downstream components in DataFusion currently branch on both
CastExprandCastColumnExpr, even though they represent equivalent semantics after recent cast unification work. This duplication increases maintenance burden, introduces unnecessary complexity, and raises the risk of inconsistent behavior when evolving cast-related logic.This PR simplifies downstream logic by collapsing these dual branches into a single
CastExpr-based handling path, aligning with the broader cast unification effort.What changes are included in this PR?
Removed
CastColumnExprhandling in downstream logic:Eliminated branching specific to
CastColumnExprin:equivalence/properties/mod.rspruning/pruning_predicate.rsUnified cast substitution logic in ordering equivalence:
Replaced
substitute_cast_like_orderingwithsubstitute_cast_orderingSimplified implementation to only consider
CastExprPreserved correctness by ensuring:
Refactored pruning rewrite logic:
rewrite_cast_child_to_prunableCastExprandTryCastExprhandling to reuse the helperTest updates and cleanup:
CastExpr::new_with_target_fieldCastColumnExprMinor code cleanups:
Are these changes tested?
Yes.
Existing pruning and equivalence tests were updated to reflect the unified cast handling.
Obsolete tests relying on
CastColumnExprwere removed.Test cases were adjusted to ensure behavior remains unchanged, including:
These updates ensure no regression in behavior while validating the simplified implementation.
Are there any user-facing changes?
No.
This change is an internal refactor and does not modify user-facing APIs or query behavior. It preserves existing semantics while improving maintainability.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.