Provide a mutation count estimation and document how mutations are calculated #12670

@evangelosdaniil

Description

Is your feature request related to a problem? Please describe.
The Spanner quotas page (https://docs.cloud.google.com/spanner/quotas) only describes how secondary indexes affect mutation counts for delete operations. For inserts and updates, the docs state that "operations count with the multiplicity of the number of columns they affect" but do not describe how secondary indexes contribute to the count.

The full formula for inserts (number of columns + sum of columns across all secondary indexes) was only confirmed informally by Google's backend team in googleapis/google-cloud-go#1721.
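To make the informally confirmed insert formula concrete, here is a minimal sketch of the arithmetic. `InsertMutationEstimate` and its methods are hypothetical names, not part of the Spanner Java client. The number of columns counted per index is left to the caller, because how key columns and STORING columns are counted is exactly what the documentation does not specify.

```java
import java.util.List;

/**
 * Hypothetical helper (not part of the Spanner Java client).
 * Applies the informally confirmed insert formula:
 *   mutations per row = columns written + sum of columns across all secondary indexes
 */
final class InsertMutationEstimate {
    private InsertMutationEstimate() {}

    /**
     * @param columnsWritten    number of table columns the insert writes
     * @param indexColumnCounts one entry per secondary index on the table: the number of
     *                          columns counted for that index (assumption: the caller
     *                          decides how key and STORING columns are counted)
     */
    static long perRow(int columnsWritten, List<Integer> indexColumnCounts) {
        long total = columnsWritten;
        for (int count : indexColumnCounts) {
            total += count;
        }
        return total;
    }

    /** Estimated mutations for inserting rowCount rows that all write the same columns. */
    static long forRows(long rowCount, int columnsWritten, List<Integer> indexColumnCounts) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(rowCount, perRow(columnsWritten, indexColumnCounts));
    }
}
```

For example, under these assumptions a row that writes 5 columns into a table with two indexes counted at 2 and 3 columns would be estimated at 10 mutations. This does not model updates, deletes, generated columns, or an index in its write-only phase.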

Additionally, when creating a new index on an existing table, we observe an immediate increase in mutation counts for writes to that table during the schema change (write-only phase), before the index is fully backfilled. This behavior and its impact on mutation budgets is not documented.

This forces teams to reverse-engineer mutation counting logic, which is fragile and breaks when the internal counting rules change.

Describe the solution you'd like

  1. A utility class in the Java client (e.g. MutationCountEstimator) that can calculate the expected mutation count for a given set of mutations before committing
    OR
  2. Complete documentation of how mutations are actually calculated for all operation types (INSERT, UPDATE, INSERT_OR_UPDATE, REPLACE, DELETE), including:
    - Indexes with STORING clauses
    - Computed/generated columns
    - The impact on mutation counts during index creation (write-only phase before backfill completes)

Ideally both :D

Describe alternatives you've considered

  • Reverse-engineering the count from schema metadata and secondary index definitions. This is error-prone and has caused production incidents when our calculation diverged from Spanner's actual count.
  • Committing a single row first, reading getMutationCount() from the commit stats on the CommitResponse, then using that to size remaining batches.
  • Using a self-imposed limit below the 80,000 mutations-per-commit limit (e.g. 75,000) to absorb miscalculations.
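
The last two workarounds can be combined. The following is a rough sketch of the batch-sizing arithmetic only: it assumes the per-row mutation count has already been observed from a single-row probe commit, and that every remaining row touches the same columns and indexes. `MutationBudgetBatcher` and its methods are hypothetical names, and no Spanner API is called here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: sizes batches from an observed per-row mutation count. */
final class MutationBudgetBatcher {
    private MutationBudgetBatcher() {}

    /**
     * @param mutationsPerRow mutation count observed for one representative row
     * @param budget          self-imposed per-commit budget, e.g. 75,000
     * @return how many rows fit in one commit without exceeding the budget
     */
    static int rowsPerBatch(long mutationsPerRow, long budget) {
        if (mutationsPerRow <= 0 || budget < mutationsPerRow) {
            throw new IllegalArgumentException("budget must cover at least one row");
        }
        return (int) Math.min(Integer.MAX_VALUE, budget / mutationsPerRow);
    }

    /** Splits rows into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(rows.size(), start + batchSize);
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }
}
```

The probe value goes stale as soon as the schema changes. As described above, a new index raises the per-row count during its write-only phase, so the probe would need to be repeated after any schema change.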

Additional context
