Reading upper/lower bounds values with type promotions#3293
Reading upper/lower bounds values with type promotions#3293geruh merged 5 commits intoapache:mainfrom
Conversation
|
@kevinjqliu @geruh @Fokko please take a look when you can! |
geruh
left a comment
There was a problem hiding this comment.
Thanks for raising Alex. I did a quick search and the same from bytes call is in some other places like StrictMetricsEval, and somewhere in inspect would they need this as well?
|
@geruh thanks for the review! I added it into StrictMetricsEval as well. I'm torn on |
I did a brief pass earlier so I wasn't able to give much context. Inspect methods are calling This can be proved by adding an integration test like: schema = Schema(NestedField(1, "val", IntegerType(), required=False))
tbl = catalog.create_table("glue.dru", schema)
tbl.append(pa.table({"val": pa.array([10, 20, 30], type=pa.int32())}))
with tbl.update_schema() as upd:
upd.update_column("val", field_type=LongType())
tbl.inspect.entries() running this gives me: Arguably, this makes the |
|
|
||
| @from_bytes.register(LongType) | ||
| def _(_: PrimitiveType, b: bytes) -> int: | ||
| if len(b) == 4: |
There was a problem hiding this comment.
nit: just curious here when i linked the comment earlier to the java code I noticed they use < 8 any reason for the == 4 on these?
There was a problem hiding this comment.
If it's not 8 bytes, it's going to be 4 bytes. I'm not sure what kind of data would be 5-7 bytes. I changed it nonetheless just to match the Java implementation.
Part of #3270
Rationale for this change
Thanks to @ndrluis for the suggestion!
upper_boundsandlower_boundsfields are binary. The type they represent may not be the true type if a type promotion has occurred.We need to determine the type based on the actual type + the number of bytes.
Are these changes tested?
Unit test included
Are there any user-facing changes?