GH-48024: [C++][Python] Preserve schema metadata in RenameColumns #48669
+90
−4
Rationale for this change

Table.rename_columns() was dropping schema metadata, which is unexpected behavior. When you rename columns, you'd expect everything else about the schema to stay the same, including any metadata attached to it.

I noticed RecordBatch.rename_columns() and Schema.WithNames() had the same problem, so I fixed those too.

What changes are included in this PR?
The issue was that these methods were calling ::arrow::schema(fields) without passing along the original metadata. The fix just adds schema()->metadata() as the second argument, the same pattern that SelectColumns() and Flatten() already use.

Files changed:
- cpp/src/arrow/table.cc - fix for Table::RenameColumns
- cpp/src/arrow/record_batch.cc - fix for RecordBatch::RenameColumns
- cpp/src/arrow/type.cc - fix for Schema::WithNames

Are these changes tested?
Yes, added tests for each fix:
- table_test.cc
- record_batch_test.cc
- type_test.cc
- test_table.py (covers both Table and RecordBatch)

Are there any user-facing changes?
Yes, this is a bug fix. After this change, rename_columns() will preserve schema metadata instead of dropping it.