[WIP][SPARK-56876][SQL] Add TimestampNTZNanosType and TimestampLTZNanosType #55952
Open
MaxGekk wants to merge 10 commits into
Convert NumberFormatException from overflowing precision strings
into UNSUPPORTED_TIMESTAMP_{LTZ,NTZ}_PRECISION with the original
digit string preserved.
Co-authored-by: Isaac
The regex in nameToType already handles every valid precision for timestamp_ltz(n) / timestamp_ntz(n) and emits a precision-specific error for invalid ones, so the parallel enumeration was dead lookup.
Co-authored-by: Isaac
Anchor both types to their parameterless counterparts (TimestampType and TimestampNTZType) and state plainly that no time zone is stored, replacing the ambiguous "time zone affects interpretation only" phrase that could read as if the type carried a zone tag.
Co-authored-by: Isaac
Drive both timestamp_ltz and timestamp_ntz through a single loop and add coverage for malformed precision forms (negative, empty, non-numeric, uppercase) that fall through to INVALID_JSON_DATA_TYPE.
Co-authored-by: Isaac
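The parsing behaviour these commit messages describe can be illustrated with a small standalone sketch. It is not the code in this PR: the class TimestampNameSketch, the method classify, and the exact regex are hypothetical, and the sketch returns the error condition name as a string instead of throwing Spark's exceptions. It also assumes, based on the description, that only precisions 7 to 9 are accepted and that anything not matching the lowercase name(digits) form falls through to INVALID_JSON_DATA_TYPE.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Spark.
final class TimestampNameSketch {
    // Lowercase names only, with one or more ASCII digits in the parentheses.
    private static final Pattern NANOS_TS =
        Pattern.compile("timestamp_(ltz|ntz)\\(([0-9]+)\\)");

    /**
     * Returns the normalized type name for a supported precision, or the
     * name of the error condition the input would map to.
     */
    static String classify(String name) {
        Matcher m = NANOS_TS.matcher(name);
        if (!m.matches()) {
            // Negative, empty, non-numeric, and uppercase forms end up here.
            return "INVALID_JSON_DATA_TYPE";
        }
        String zone = m.group(1).toUpperCase(Locale.ROOT);
        String digits = m.group(2);
        String unsupported =
            "UNSUPPORTED_TIMESTAMP_" + zone + "_PRECISION(" + digits + ")";
        int precision;
        try {
            precision = Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            // Overflowing digit strings: report them with the original digits.
            return unsupported;
        }
        // Assumption: only 7, 8, and 9 are supported precisions.
        if (precision < 7 || precision > 9) {
            return unsupported;
        }
        return "timestamp_" + m.group(1) + "(" + precision + ")";
    }
}
```

With this sketch, an overflowing precision such as timestamp_ltz(99999999999999999999) keeps its original digit string in the reported error, while TIMESTAMP_LTZ(9) and timestamp_ntz(-1) never reach the precision check.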
What changes were proposed in this pull request?
In this PR, I propose extending the Spark SQL type system with two new classes in the Scala/Java APIs: TimestampNTZNanosType and TimestampLTZNanosType.
They are public API entry points only, and have no SQL/DDL/datasource integration in this PR.
The classes follow the direction of the SQL standard’s optional feature F555, “Enhanced seconds precision”: datetime types can carry fractional seconds with precision p in the SECOND field beyond the traditional six decimal places (microseconds). Here p is restricted to 7, 8, and 9, i.e. the band that extends to nanoseconds (up to nine fractional digits).
The logical layout documented on the classes matches this precision story: epoch microseconds plus nanoseconds within that microsecond, with a default estimated width of 10 bytes for planning (8 + 2).
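To make the layout concrete, here is a minimal sketch of how an instant could be split into the two documented components and put back together. It is an illustration under assumptions, not the representation implemented in this PR: the class NanosLayoutSketch and its methods are hypothetical, and pairing the 8 bytes with a long and the 2 bytes with a short is one reading of "8 + 2".

```java
import java.time.Instant;

// Hypothetical illustration only; none of these names exist in Spark.
final class NanosLayoutSketch {
    // Assumed reading of "8 + 2": a long for micros plus a short for nanos.
    static final int ESTIMATED_WIDTH = Long.BYTES + Short.BYTES;

    /** Microseconds since the epoch, rounded toward negative infinity. */
    static long epochMicros(Instant t) {
        // getEpochSecond() is floored and getNano() is always in [0, 999_999_999],
        // so this also holds for instants before 1970.
        return Math.addExact(
            Math.multiplyExact(t.getEpochSecond(), 1_000_000L),
            t.getNano() / 1_000);
    }

    /** Nanoseconds within the microsecond, always in [0, 999]. */
    static short nanosWithinMicro(Instant t) {
        return (short) (t.getNano() % 1_000);
    }

    /** Rebuilds the instant from the two components. */
    static Instant toInstant(long micros, short nanos) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long nanoOfSecond = Math.floorMod(micros, 1_000_000L) * 1_000L + nanos;
        return Instant.ofEpochSecond(seconds, nanoOfSecond);
    }
}
```

For 2024-01-02T03:04:05.123456789Z the nanosecond component is 789 and the microsecond component carries everything else. Because the nanosecond part never exceeds 999, it fits comfortably in two bytes.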
Parameterless timestamp_ntz / timestamp_ltz are unchanged and remain the existing microsecond-oriented types.
Why are the changes needed?
New timestamp types are useful for Spark SQL users because they allow:
Does this PR introduce any user-facing change?
Public API adds two new types in org.apache.spark.sql.types; they cannot yet be used in DataFrames, schemas read from datasources, or SQL DDL.
How was this patch tested?
By extending DataTypeSuite (round-trip and precision bounds for the new types, including invalid precisions).
Plus SparkThrowableSuite / error-json validation if error-conditions.json is updated.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7