Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/trigger_files/beam_PostCommit_Python.json
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
{
"comment": "Modify this file in a trivial way to cause this test suite to run.",
"pr": "37345",
"modification": 49
}
"modification": 50
}
13 changes: 10 additions & 3 deletions sdks/python/apache_beam/io/gcp/bigquery_file_loads.py
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,15 @@ def _bq_uuid(seed=None):
return str(hashlib.md5(seed.encode('utf8')).hexdigest())


def _bq_uuid_list(list):
cs = hashlib.sha256()
for item in list:
cs.update(bigquery_tools.get_hashable_destination(item).encode('utf-8'))
# separator
cs.update(b'\x00')
return cs.hexdigest()


class _ShardDestinations(beam.DoFn):
"""Adds a shard number to the key of the KV element.

Expand Down Expand Up @@ -589,9 +598,7 @@ def process(
write_disposition = 'WRITE_APPEND'
wait_for_job = False

chunk_job_name = copy_job_name_base
if len(chunks) > 1:
chunk_job_name = f"{copy_job_name_base}_{i}"
chunk_job_name = '%s_%s' % (copy_job_name_base, _bq_uuid_list(chunk))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use f-strings instead of % formatting for consistency with the rest of the codebase and modern Python standards (PEP 498).

Suggested change
chunk_job_name = '%s_%s' % (copy_job_name_base, _bq_uuid_list(chunk))
chunk_job_name = f"{copy_job_name_base}_{_bq_uuid_list(chunk)}"
References
  1. Use f-strings (PEP 498) for string formatting to maintain consistency and readability. (link)

@Abacn Abacn Jun 25, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to be careful here. I remember bq load job name being deterministic is intended to avoid duplicate data on retry.

@Abacn Abacn Jun 25, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a commit causing these test failures (#38983 ?), consider revert it for the current release then fix forward, since release already cut yesterday and there is a risk potential problems not revealed if fix-forward with cherry-picks


_LOGGER.info(
"Triggering copy job %s from %s to %s (write_disposition: %s)",
Expand Down
Loading