
perf: Cache Transaction.table_metadata between staged updates #3302

Open

rynewang wants to merge 2 commits into apache:main from rynewang:perf/cache-transaction-table-metadata

Conversation

@rynewang

Summary

Transaction.table_metadata calls update_table_metadata on every access, which replays every staged update through model_copy(deep=True). Code that reads the property repeatedly within a single logical operation — e.g. the snapshot producer's schema()/spec()/new_manifest_writer() helpers inside per-manifest loops, or the ~5 reads in _SnapshotProducer.__init__ — pays that cost each time.

Change

Cache the result, keyed on the identity of the two inputs (self._table.metadata, self._updates). Since _updates is an immutable tuple, every self._updates += (...) rebinds it to a new object, so the is check self-invalidates without any explicit cache-clearing at mutation sites (_stage, _apply, CreateTableTransaction._initial_changes, commit_transaction). Likewise the cache misses when Table.metadata is refreshed after a commit.
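A minimal sketch of the mechanism (the attribute name _cached_metadata and the tuple layout are illustrative, not necessarily the PR's exact identifiers):

@property
def table_metadata(self) -> TableMetadata:
    # Key the cache on the identity of both inputs. Staging an update
    # rebinds self._updates to a new tuple, and a refresh rebinds
    # self._table.metadata, so a stale entry simply fails the `is` checks.
    base, updates = self._table.metadata, self._updates
    cached = self._cached_metadata  # (base, updates, result) or None
    if cached is not None and cached[0] is base and cached[1] is updates:
        return cached[2]
    result = update_table_metadata(base, updates)
    self._cached_metadata = (base, updates, result)
    return result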

Behavior note

The only observable difference is that last_updated_ms on the returned metadata is now stable across repeated reads of the same logical state, instead of being re-stamped with now() on each access. The timestamp written at commit time is unaffected.
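Concretely, for a transaction txn with pending updates (illustrative):

m1 = txn.table_metadata
m2 = txn.table_metadata
assert m1 is m2                                  # same cached object
assert m1.last_updated_ms == m2.last_updated_ms  # previously re-stamped on each read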

Testing

New test_transaction_table_metadata_cached asserts that 11 consecutive reads trigger exactly one update_table_metadata call, and that staging an update invalidates and recomputes once.
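Roughly, such a test can spy on update_table_metadata (the patch target below assumes Transaction resolves it as a module-level name in pyiceberg.table; the PR's actual test may be wired differently):

from unittest.mock import patch

import pyiceberg.table as tbl

def test_transaction_table_metadata_cached(table):
    txn = table.transaction()
    with patch.object(tbl, "update_table_metadata", wraps=tbl.update_table_metadata) as spy:
        for _ in range(11):
            _ = txn.table_metadata       # repeated reads of the same logical state
        assert spy.call_count == 1       # computed once, then served from the cache

        txn.set_properties(owner="ci")   # rebinds txn._updates -> cache miss
        _ = txn.table_metadata
        assert spy.call_count == 2       # exactly one recompute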

Relationship to #2674 / #3301

#2674 and #3301 hoist the property access out of specific loops in snapshot.py. This PR fixes the root cause, so those hoists become unnecessary (though harmless) and any future call sites don't need to remember to hoist.

@geruh (Member) left a comment


Why not just do this:

@property
def table_metadata(self) -> TableMetadata:
  if not self._updates:
    return self._table.metadata
  return update_table_metadata(self._table.metadata, self._updates)

Instead of introducing a cache here, we could just return the base metadata when there are no pending updates. That avoids the cache-lookup complexity and should be (marginally, lol) faster than this for a simple append. I also don't see a world where we stage updates and read frequently. The early return above handles the normal commit case, unless I'm missing something.

@rynewang (Author) commented May 1, 2026

Hi @geruh, thanks for the fast review!

On the simplified design though, the case it doesn't cover is CreateTableTransaction: _initial_changes() seeds self._updates with ~8–10 entries (AssignUUID, UpgradeFormatVersion, AddSchema, SetCurrentSchema, AddPartitionSpec, SetDefaultSpec, AddSortOrder, SetDefaultSortOrder, SetLocation, SetProperties) right in __init__, so _updates is never empty for the snapshot producer's whole lifetime. Every table_metadata read during _summary() / _manifests() / the new_manifest_writer/spec/schema helpers would still pay the full replay under the early return. That's the path I was profiling: create+append in one transaction, which is where the repeated deep-copy cost actually bites.

Same story for any multi-op transaction where something stages an update before the append (e.g. set_properties then fast_append).

My use case is to create a CreateTableTransaction, add a lot of Parquet files, then call commit; that path goes through _summary and _manifests, so it would be hit.
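For concreteness, a stripped-down version of that workload (catalog, identifier, schema, and the pyarrow table df are assumed to be set up already):

txn = catalog.create_table_transaction(identifier, schema)  # _updates seeded in __init__
txn.append(df)            # snapshot producer reads table_metadata repeatedly here
txn.commit_transaction()  # the `if not self._updates` early return never fires above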

Keeping a ref to the update tuple won't copy the memory; it just increments a refcount, so I think it's fine?
