Preserve logical cast field semantics during physical lowering with field-aware CastExpr#20836
Merged
kosiew merged 9 commits intoapache:mainfrom Apr 9, 2026
Merged
Preserve logical cast field semantics during physical lowering with field-aware CastExpr#20836kosiew merged 9 commits intoapache:mainfrom
kosiew merged 9 commits intoapache:mainfrom
Conversation
1058b83 to
df5486c
Compare
Contributor
Author
|
@adriangb |
Contributor
|
Interestingly I was just poking around here myself: #21390 |
Contributor
Author
Refactor logical Expr::Cast to use field-aware CastExpr, ensuring target FieldRef metadata is preserved. Enhance tests to confirm metadata retention, validate that same-type casts aren't elided for fields with semantics, and ensure existing TryCast rejection for extension types remains effective.
Expose shared cast_with_target_field helper to validate and build field-aware CastExprs in one place. Update planner to directly call this helper, removing the need for temporary type-only casts. Add regression tests to cover standard casts, metadata-bearing casts, and same-type semantic-preserving casts.
Consolidate default-target-field predicate and success construction path in cast_with_target_field to reduce duplicate code in cast.rs. Simplify tests in planner.rs by implementing shared setup helpers and caching return_field() results for standard casts.
Narrow cast_with_target_field from a public re-export to a crate-only re-export in datafusion/physical-expr/src/expressions/mod.rs. This change allows the planner to still utilize it while reducing the public expressions API surface.
Use as_planner_cast(...) helper in planner tests to eliminate repeated downcasting. Update cast_with_target_field documentation to clarify that default synthesized fields are elided while explicit field semantics are preserved.
- Refactored the handling of `Expr::Cast` to remove unnecessary line breaks and improve readability in the `create_physical_expr` function. - Modified the test for cast lowering to maintain consistent formatting while preserving target field metadata.
Collapse the nested if statements in datafusion/physical-expr/src/expressions/cast.rs to satisfy Clippy's collapsible_if lint. This change does not alter any existing behavior.
Contributor
Author
|
@adriangb |
adriangb
approved these changes
Apr 7, 2026
Contributor
adriangb
left a comment
There was a problem hiding this comment.
Nice work! Can we add SLT tests that would reflect these changes?
Introduce TypePlanner hook in test_context.rs for file-specific SLT to handle UUID target-field metadata. Added cast_extension_type_metadata.slt to cover new test cases for CAST and TRY_CAST with FixedSizeBinary. Ensure target metadata is preserved and acknowledge unchanged rejection path for TRY_CAST.
Eliminate the test_try_cast_to_extension_type_is_rejected from planner.rs as the new SLT now directly covers this case. This cleanup ensures better maintainability and reduces test duplication.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
The current physical planning path for
Expr::Castdiscards logical field information (name, nullability, and metadata) by lowering casts using only the targetDataType. This results in a loss of semantic fidelity between logical and physical plans, particularly for metadata-bearing fields and same-type casts with explicit field intent.Additionally, the planner previously rejected casts with metadata due to limitations of the type-only casting API, creating inconsistencies with other parts of the system (e.g. adapter-generated expressions).
This change introduces a field-aware casting path that preserves logical intent throughout physical lowering, ensuring consistent semantics across planner and adapter outputs.
What changes are included in this PR?
cast_with_target_fieldto constructCastExprusing fullFieldRefsemantics (name, nullability, metadata).cast_with_optionsto delegate to the new field-aware helper.is_default_target_fieldto a shared helper function for reuse.planner.rs) to usecast_with_target_fieldinstead of type-only casting.cast_with_target_fieldfor internal planner use.Are these changes tested?
Yes.
Added planner-focused unit tests to validate:
TryCastThese tests ensure both backward compatibility and correctness of the new semantics.
Are there any user-facing changes?
Yes, behaviorally (but not API-breaking):
CastExprwhen explicit field semantics are provided.There are no breaking changes to public APIs, but downstream consumers that relied on previous planner behavior (e.g. metadata stripping or cast elision) may observe differences.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.