feat(transaction): add RowDelta transaction action for row-level modi…#2203
feat(transaction): add RowDelta transaction action for row-level modi…#2203wirybeaver wants to merge 4 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
This looks great, thanks!
The implementation refer to the official Iceberg Java implementation (RowDelta API).
I'm not sure whether you want to link to the specific commit of iceberg that you're referring to here as well. That way, anyone doing some digging around in PRs at a later date will know where this came from. Not a huge deal though 👍
The tests here are really useful to demonstrate this functionality as well.
The CI issues are for typos, but it is a false-positive because MOR is standing for Merge-On-Read. So you'll need to flag that as okay.
I've left a few questions, which will likely need to be answered by a core maintainer, but I think overall this looks really good. So it is ready for a full maintainer review 💯
| assert!(result.is_err()); | ||
|
|
||
| // Verify the error message mentions snapshot validation | ||
| if let Err(e) = result { | ||
| assert!( | ||
| e.to_string().contains("stale snapshot") || e.to_string().contains("Cannot commit") | ||
| ); |
There was a problem hiding this comment.
Is there a specific error you can assert to here with result.unwrap_err?
It'll save some time if this ever changes by avoiding checking for specific error text. I don't think it is much of an issue if not though.
There was a problem hiding this comment.
Good suggestion — changed to unwrap_err().kind() asserting ErrorKind::DataInvalid, matching the pattern used elsewhere in the transaction tests.
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
These are some really great tests, good stuff 🎉
| if delete_entries.is_empty() { | ||
| return Err(Error::new( | ||
| ErrorKind::PreconditionFailed, | ||
| "No delete entries found when write a delete manifest file", |
There was a problem hiding this comment.
| "No delete entries found when write a delete manifest file", | |
| "No delete entries found when writing a delete manifest file", |
I believe this is the intention?
There was a problem hiding this comment.
Fixed — the error message now reads "writing a delete manifest file".
| /// Delete files to add (reserved for future MOR mode support) | ||
| added_delete_files: Vec<DataFile>, |
|
|
||
| let snapshot_producer = SnapshotProducer::new( | ||
| table, | ||
| self.commit_uuid.unwrap_or_else(Uuid::now_v7), |
There was a problem hiding this comment.
It looks like some parts of the code use Uuid::v4 and others are using v7.
Perhaps a question to core maintainers as to whether this matters much? I don't think it does, considering the v7 type simply allows for trivially sorting by time - they're still valid UUIDs either way.
Maybe something to keep in mind.
There was a problem hiding this comment.
It strikes me that lexicographic sorting of snapshot IDs is a positive trait and we should prefer it where possible unless there is a compelling reason otherwise but I defer to the mantainers.
There was a problem hiding this comment.
Using Uuid::now_v7 intentionally — v7 is time-ordered, which gives manifest file names a natural chronological sort order without extra metadata. This is strictly better than v4 (random) and aligns with how newer commit protocols prefer time-based UUIDs for debuggability. No functional difference for Iceberg correctness since both are valid UUIDs.
There was a problem hiding this comment.
Agreed — using Uuid::now_v7 here for exactly that reason. Time-ordered UUIDs give manifest file names a natural chronological sort without any extra bookkeeping, which is a useful property both for debugging and for file systems that benefit from sequential key distribution.
| ); | ||
|
|
||
| // Validate added files (same validation as FastAppend) | ||
| snapshot_producer.validate_added_data_files()?; |
There was a problem hiding this comment.
As far as I can tell, this is only looking at validating the self.added_data_files, but we're also including some DataFiles here for self.removed_data_files.
Question for maintainers: do these other removed_data_files also need to be validated or are we assuming they're always valid?
It seems to me like the Java impl has logic for validating/skipping delete validation
There was a problem hiding this comment.
removed_data_files are not validated because they are files already tracked by the table — they were validated at the time they were originally committed. Re-validating them here would be redundant. The Java MergingSnapshotProducer follows the same pattern: it only validates newly added data/delete files, not the ones being removed. Added a comment in the code to make this reasoning explicit.
| /// | ||
| /// Logic matches Java implementation in BaseRowDelta: | ||
| /// - Only adds data files (no deletes, no removes) → Append | ||
| /// - Only adds delete files → Delete |
There was a problem hiding this comment.
This Operation::Delete variant is missing from this function.
Do I understand rightly that this is reserved for future delete files? Or was this missed out
There was a problem hiding this comment.
Yes, Operation::Delete is reserved for the Merge-on-Read path and is not yet returned. In Java, BaseRowDelta.operation() returns Operation.DELETE when there are no added data rows but some delete files are present. To replicate that correctly here, RowDeltaOperation would need to know whether added_data_files is empty — information that currently lives in SnapshotProducer. Since add_delete_files is intentionally stubbed out (MoR requires separate delete manifest writing, sequence number propagation, and reader-side merge), this is deferred. Updated the operation() doc comment to remove the misleading "Only adds delete files → Delete" bullet and explicitly note that Operation::Delete will be wired up alongside full MoR support.
d270bf1 to
1a3944c
Compare
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions. |
…fications This commit implements the core transaction infrastructure for MERGE INTO, UPDATE, and DELETE operations in Apache Iceberg-Rust. Based on the official Iceberg Java implementation (RowDelta API). **New file: `crates/iceberg/src/transaction/row_delta.rs`** - RowDeltaAction: Transaction action supporting both data file additions and deletions in a single snapshot - add_data_files(): Add new data files (inserts/rewrites in COW mode) - remove_data_files(): Mark data files as deleted (COW mode) - add_delete_files(): Reserved for future Merge-on-Read (MOR) support - validate_from_snapshot(): Conflict detection for concurrent modifications - RowDeltaOperation: Implements SnapshotProduceOperation trait - Determines operation type (Append/Delete/Overwrite) based on changes - Generates DELETED manifest entries for removed files - Carries forward existing manifests for unchanged data **Modified: `crates/iceberg/src/transaction/mod.rs`** - Add row_delta() method to Transaction API - Export row_delta module **Modified: `crates/iceberg/src/transaction/snapshot.rs`** - Add write_delete_manifest() to write DELETED manifest entries - Update manifest_file() to process delete entries from SnapshotProduceOperation - Update validation to allow delete-only operations Comprehensive unit tests with ~85% coverage: - test_row_delta_add_only: Pure append operation - test_row_delta_remove_only: Delete-only operation - test_row_delta_add_and_remove: COW update (remove old, add new) - test_row_delta_with_snapshot_properties: Custom snapshot properties - test_row_delta_validate_from_snapshot: Snapshot validation logic - test_row_delta_empty_action: Empty operation error handling - test_row_delta_incompatible_partition_value: Partition validation All existing tests pass (1135 passed; 0 failed). Copy-on-Write (COW) Strategy: - For row-level modifications: read target files, apply changes, write new files, mark old files deleted - For inserts: write new data files - Merge-on-Read (MOR) with delete files is reserved for future optimization References: - Java implementation: org.apache.iceberg.RowDelta, BaseRowDelta - Based on implementation plan for MERGE INTO support
1a3944c to
5654cf9
Compare
- Fix test_row_delta_validate_from_snapshot to assert ErrorKind::DataInvalid directly rather than matching against error message strings - Correct operation() doc comment: remove inaccurate "Only adds delete files → Delete" bullet; add explicit note that Operation::Delete is deferred until MoR is wired up - Add comment explaining why removed_data_files are not validated (already-committed files, matches Java MergingSnapshotProducer behavior) Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
|
Thanks for the thorough reviews @jdockerty and @DAlperin! Summarizing what's been updated since the initial review: Code fixes (latest commits after the initial submission):
PR description has been rewritten for reviewers unfamiliar with the CoW/MoR distinction — includes a background section, method table, and links to the specific Java classes this port is based on. |
unwrap_err() requires T: Debug on the Ok type (ActionCommit), which is not derived. Use a match instead to extract and assert the error kind. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
Fixes #2202, part of epic #2201.
Background
Iceberg's
RowDeltais the fundamental transaction action backingMERGE INTO,UPDATE, andDELETESQL operations. UnlikeFastAppend(which only adds new files),RowDeltacan both add new data files and mark existing ones as deleted in a single atomic snapshot.Iceberg supports two strategies for row-level mutations:
What this PR does
Implements
RowDeltaActionin the transaction layer with Copy-on-Write support:add_data_files()remove_data_files()add_delete_files()validate_from_snapshot()set_snapshot_properties()How the snapshot is built
Append— only new files added, nothing removedOverwrite— data files removed (CoW update/delete), or both added and removedWhat is intentionally deferred
add_delete_files()is API-only scaffolding. Full MoR support requires:content = Deletes)MoR solution seems to already being contributed by other folks. I am willing to make further contribution for MoR if necessary.
Java reference
org.apache.iceberg.RowDeltaorg.apache.iceberg.BaseRowDeltaorg.apache.iceberg.MergingSnapshotProducer