[SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode#55940
[SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode#55940gengliangwang wants to merge 2 commits into
Conversation
Stack overview (SPARK-56908 umbrella)This PR is part of a stack of 8 PRs against SPARK-56908. Order:
PRs 1-4 are linearly stacked on each other (each branch is based on the previous one). PR 5 (decimal arithmetic) is stacked on top of PR 3 (cast decimal) since it uses |
c71697c to
9ddcf71
Compare
| if (failOnError) { | ||
| val utils = classOf[DateTimeConstructorUtils].getName | ||
| nullSafeCodeGen(ctx, ev, (year, month, week, day, hour, min, sec) => { | ||
| val secFrac = sec.getOrElse("org.apache.spark.sql.types.Decimal$.MODULE$.apply(0, " + |
There was a problem hiding this comment.
secFrac fallback in ANSI codegen is opaque.
Two problems:
- Dead code in practice. MakeInterval.children always has 7 elements (all the auxiliary this(...) constructors pass Literal(0) / Literal(Decimal(...)) to fill missing slots), so SeptenaryExpression.nullSafeCodeGen always passes Some(...) to the lambda. getOrElse never fires.
- Inconsistency with the non-ANSI branch in the same file: val secFrac = sec.getOrElse("0") // non-ANSI branch. The non-ANSI branch's "0" is int-typed but IntervalUtils.makeInterval's 7th argument is Decimal. If the getOrElse ever fired, the non-ANSI path would emit … makeInterval(…, 0) which wouldn't compile, while the ANSI path would emit a valid Decimal$.MODULE$.apply(0, ...). You appear to have noticed the type issue but only fixed one branch. Both branches' fallbacks are unreachable, but it's still an inconsistency that will confuse future readers
The pre-PR code had the same "0" dead-code fallback. So the PR didn't introduce the dead code — but it did expand it into a more elaborate (still dead) form on one branch only. Cleanest fix: replace sec.getOrElse(...) with sec.get (assertive) and add a comment that MakeInterval always provides 7 children. Or stick with getOrElse but use the same string on both branches.
| /** | ||
| * Builds a day count for {@code MakeDate(year, month, day)} in ANSI mode. | ||
| * Throws a {@code SparkDateTimeException} if the arguments don't form a | ||
| * valid {@link LocalDate}. | ||
| */ |
There was a problem hiding this comment.
Tighten makeDateExact Javadoc to clarify it only catches DateTimeException, not all errors from constructing/converting a LocalDate (e.g., the ArithmeticException from localDateToDays's toIntExact is not caught).
| * the user-facing {@code ansiDateTimeArgumentOutOfRange} / | ||
| * {@code arithmeticOverflowError}. | ||
| */ | ||
| public final class DateTimeConstructorUtils { |
There was a problem hiding this comment.
If the broad name is forward-looking, document the planned scope in the PR description so the name is justified. Otherwise, rename it to match the current contents (e.g. MakeDateAndIntervalUtils).
9ddcf71 to
c13127e
Compare
|
Addressed in c13127e, thanks @viirya:
PTAL. |
| * the user-facing {@code ansiDateTimeArgumentOutOfRange} / | ||
| * {@code arithmeticOverflowError}. | ||
| */ | ||
| public final class MakeDateAndIntervalUtils { |
There was a problem hiding this comment.
There are many similar PRs ongoing, do we have a plan for adding and consolidating these util java functions? I feel the current one-expression-one-java-class approach is a bit overkill, shall we group them?
…mode ### What changes were proposed in this pull request? Introduce `DateTimeConstructorUtils.java` with two static helpers: * `makeDateExact(int year, int month, int day)`: wraps `LocalDate.of(...) + DateTimeUtils.localDateToDays(...)` in the `ansiDateTimeArgumentOutOfRange(DateTimeException)` try/catch. * `makeIntervalExact(int years, int months, int weeks, int days, int hours, int mins, Decimal secs)`: wraps `IntervalUtils.makeInterval` in the `arithmeticOverflowError(message, "", null)` try/catch. `datetimeExpressions.MakeDate` and `intervalExpressions.MakeInterval` delegate to the helpers in their `failOnError = true` codegen + eval paths. The non-ANSI branch keeps the inline `try/catch -> null` form because it sets `isNull = true` on failure. ### Why are the changes needed? Part of SPARK-56908 (umbrella). The 7-line try/catch wrapper that maps `DateTimeException` / `ArithmeticException` to the user-facing ANSI error was emitted inline in `doGenCode`; the helper collapses it to a single call per call site without losing the wrapped exception's message (no `QueryContext` argument is involved, so this PR introduces no per-row `references[]` regression). ### Does this PR introduce _any_ user-facing change? No. The compiled behavior is identical; only the emitted Java source text changes. ### How was this patch tested? ``` build/sbt "catalyst/testOnly *DateExpressionsSuite *IntervalExpressionsSuite" ``` 96/96 pass. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Cursor 1.x
c13127e to
d118961
Compare
|
Good point, thanks @cloud-fan. I followed the codebase convention: existing utility files like Plan across the SPARK-56908 stack:
The new name parallels |
|
The PR description still references DateTimeConstructorUtils.java and describes makeDateExact as wrapping LocalDate.of(...) + DateTimeUtils.localDateToDays(...) in the ansiDateTimeArgumentOutOfRange try/catch — but the final class is DateTimeExpressionUtils.java, and per the javadoc tightening in c13127e, the helper only catches the DateTimeException from LocalDate.of(...) (the ArithmeticException from localDateToDays's internal toIntExact propagates). Could you refresh the PR body so the squash commit message picks up the accurate description? Thanks! |
|
Refreshed the PR body — |
|
@cloud-fan @viirya thanks for the review. Merging to master/4.x |
…mode ### What changes were proposed in this pull request? Introduce `DateTimeExpressionUtils.java` with two static helpers: * `makeDateExact(int year, int month, int day)`: wraps `LocalDate.of(...)` in an `ansiDateTimeArgumentOutOfRange(DateTimeException)` try/catch, then calls `DateTimeUtils.localDateToDays(...)`. The `ArithmeticException` from `localDateToDays`'s internal `toIntExact` is intentionally not caught — it propagates as-is (matching the pre-PR behavior for that overflow case). * `makeIntervalExact(int years, int months, int weeks, int days, int hours, int mins, Decimal secs)`: wraps `IntervalUtils.makeInterval` in the `arithmeticOverflowError(message, "", null)` try/catch. `datetimeExpressions.MakeDate` and `intervalExpressions.MakeInterval` delegate to the helpers in their `failOnError = true` codegen + eval paths. The non-ANSI branch keeps the inline `try/catch -> null` form because it sets `isNull = true` on failure. ### Why are the changes needed? Part of SPARK-56908 (umbrella). The try/catch wrapper that maps `DateTimeException` (for `MakeDate`) / `ArithmeticException` (for `MakeInterval`) to the user-facing ANSI error was emitted inline in `doGenCode`; the helper collapses it to a single call per call site without losing the wrapped exception's message (no `QueryContext` argument is involved, so this PR introduces no per-row `references[]` regression). ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? ``` build/sbt "catalyst/testOnly *DateExpressionsSuite *IntervalExpressionsSuite" ``` 96/96 pass. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Cursor 1.x Closes #55940 from gengliangwang/SPARK-56915-make-date-interval. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Gengliang Wang <gengliang@apache.org> (cherry picked from commit dde0863) Signed-off-by: Gengliang Wang <gengliang@apache.org>
What changes were proposed in this pull request?
Introduce
DateTimeExpressionUtils.javawith two static helpers:makeDateExact(int year, int month, int day): wrapsLocalDate.of(...)in anansiDateTimeArgumentOutOfRange(DateTimeException)try/catch, then callsDateTimeUtils.localDateToDays(...). TheArithmeticExceptionfromlocalDateToDays's internaltoIntExactis intentionally not caught — it propagates as-is (matching the pre-PR behavior for that overflow case).makeIntervalExact(int years, int months, int weeks, int days, int hours, int mins, Decimal secs): wrapsIntervalUtils.makeIntervalin thearithmeticOverflowError(message, "", null)try/catch.datetimeExpressions.MakeDateandintervalExpressions.MakeIntervaldelegate to the helpers in theirfailOnError = truecodegen + eval paths. The non-ANSI branch keeps the inlinetry/catch -> nullform because it setsisNull = trueon failure.Why are the changes needed?
Part of SPARK-56908 (umbrella). The try/catch wrapper that maps
DateTimeException(forMakeDate) /ArithmeticException(forMakeInterval) to the user-facing ANSI error was emitted inline indoGenCode; the helper collapses it to a single call per call site without losing the wrapped exception's message (noQueryContextargument is involved, so this PR introduces no per-rowreferences[]regression).Does this PR introduce any user-facing change?
No.
How was this patch tested?
96/96 pass.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Cursor 1.x