FIXES #26603: set Trino table diff catalog/schema from table FQN #26604

Open
IceS2 wants to merge 6 commits into main from task/trino-table-diff-test-missing-catalog-na-4e1e9f4c

Conversation

@IceS2 IceS2 commented Mar 19, 2026

Fixes #26603

Summary

  • Fixes MISSING_CATALOG_NAME error when running Compare 2 tables for differences (tableDiff) on Trino/Starburst
  • TrinoConnection.get_connection_dict() returns the service-level catalog and no schema, so cross-catalog diffs fail because data_diff opens sessions against the wrong catalog
  • Adds TrinoTableParameter that overrides get_data_diff_url() to inject the table-specific catalog and schema from the FQN into the connection dict
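
A minimal sketch of the override described in the summary, not the code in this PR. `split_fqn` is a hypothetical stand-in for OpenMetadata's `fqn.split` (it does a naive dot split and ignores quoting), `with_table_catalog` is an illustrative name, and the dict keys and FQNs other than the `cdl` and `iceberg_nlm` catalogs are assumptions.

```python
from typing import Dict, List, Union

ConnectionConfig = Union[str, Dict[str, str]]


def split_fqn(table_fqn: str) -> List[str]:
    """Hypothetical stand-in for fqn.split: naive dot split, no quoting support."""
    return table_fqn.split(".")


def with_table_catalog(source_url: ConnectionConfig, table_fqn: str) -> ConnectionConfig:
    """Return a connection config whose catalog/schema come from the table FQN."""
    if not isinstance(source_url, dict):
        # String URLs are passed through unchanged in this sketch.
        return source_url
    _, catalog, schema, _ = split_fqn(table_fqn)
    # Build a new dict so the service-level config is left untouched.
    return {**source_url, "catalog": catalog, "schema": schema}


# Service-level config: one catalog, no schema (keys are assumed for illustration).
service_level = {"driver": "trino", "host": "localhost", "catalog": "cdl"}

source = with_table_catalog(service_level, "trino_svc.cdl.sales.orders")
target = with_table_catalog(service_level, "trino_svc.iceberg_nlm.sales.orders")
print(source["catalog"], target["catalog"])  # cdl iceberg_nlm
```

Each side of the diff gets its own dict, so a cross-catalog comparison no longer falls back to the single service-level catalog.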

Fixes the issue reported on 1.11.13 where a Starburst customer got:

TrinoUserError(type=USER_ERROR, name=MISSING_CATALOG_NAME,
  message="Catalog must be specified when session catalog is not set")

Test plan

  • Unit test for TrinoTableParameter.get_data_diff_url verifying catalog/schema override
  • Verified locally against Trino Docker with cross-catalog table diff (cdl → iceberg_nlm)

…iff connection

TrinoConnection.get_connection_dict() returns a dict with the service-level
catalog and no schema. When data_diff opens a session for each table in a
cross-catalog diff, both tables end up using the same (wrong) catalog and
no schema, causing MISSING_CATALOG_NAME errors.

TrinoTableParameter.get_data_diff_url() now overrides the dict's catalog
and schema with values extracted from the table FQN, so each data_diff
connection targets the correct Trino catalog.
@IceS2 IceS2 requested a review from a team as a code owner March 19, 2026 16:02
Copilot AI review requested due to automatic review settings March 19, 2026 16:02
@github-actions github-actions bot added the Ingestion and safe to test labels Mar 19, 2026
@IceS2 IceS2 changed the title from "fix(data-quality): set Trino table diff catalog/schema from table FQN" to "FIXES #26603: set Trino table diff catalog/schema from table FQN" Mar 19, 2026
) -> Union[str, dict]:
    source_url = super().get_data_diff_url(db_service, table_fqn, override_url)
    if isinstance(source_url, dict):
        _, catalog, schema, _ = fqn.split(table_fqn)

💡 Edge Case: Unguarded tuple unpacking of fqn.split assumes exactly 4 parts

In TrinoTableParameter.get_data_diff_url, the destructuring _, catalog, schema, _ = fqn.split(table_fqn) will raise a ValueError if the FQN doesn't have exactly 4 parts. While table FQNs in OpenMetadata are conventionally 4-part (service.database.schema.table), fqn.split itself imposes no length constraint and simply returns however many tokens the parser finds. A malformed or truncated FQN would produce a confusing ValueError: not enough values to unpack instead of a clear error message. Other callers in the codebase (e.g., split_test_case_fqn) add explicit length checks before unpacking.

Suggested fix:

fqn_parts = fqn.split(table_fqn)
if len(fqn_parts) != 4:
    raise ValueError(
        f"Expected a 4-part table FQN (service.catalog.schema.table), got: {table_fqn}"
    )
_, catalog, schema, _ = fqn_parts
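
To see what the guard changes in practice, here is a self-contained sketch under stated assumptions: `split_fqn` is a hypothetical naive stand-in for `fqn.split`, and `catalog_and_schema` and `describe_failure` are illustrative helpers that do not exist in the codebase.

```python
from typing import List, Tuple


def split_fqn(table_fqn: str) -> List[str]:
    """Hypothetical stand-in for fqn.split: naive dot split, no quoting support."""
    return table_fqn.split(".")


def catalog_and_schema(table_fqn: str) -> Tuple[str, str]:
    """Extract (catalog, schema), failing with a descriptive message on bad input."""
    parts = split_fqn(table_fqn)
    if len(parts) != 4:
        raise ValueError(
            f"Expected a 4-part table FQN (service.catalog.schema.table), got: {table_fqn}"
        )
    _, catalog, schema, _ = parts
    return catalog, schema


def describe_failure(table_fqn: str) -> str:
    """Return the error text for a bad FQN, or 'ok' when it parses."""
    try:
        catalog_and_schema(table_fqn)
    except ValueError as exc:
        return str(exc)
    return "ok"


print(catalog_and_schema("svc.cat.sch.tbl"))
print(describe_failure("svc.cat.tbl"))
```

With the guard, a truncated FQN is reported together with the offending value instead of the generic unpacking error.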

Copilot AI left a comment

Pull request overview

Fixes Trino/Starburst cross-catalog tableDiff failures by ensuring the data_diff connection dict uses the table FQN’s catalog/schema (not the service-level catalog).

Changes:

  • Add TrinoTableParameter to override get_data_diff_url() and inject catalog/schema parsed from table FQN into the Trino connection dict.
  • Wire the new TrinoTableParameter into Trino’s ServiceSpec as the data_diff runtime parameter setter.
  • Add a unit test validating catalog/schema override behavior for Trino.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| ingestion/tests/unit/observability/data_quality/validations/runtime_param_setter/test_base_diff_params_setter.py | Adds unit test asserting Trino tableDiff URL dict is updated with catalog/schema from the table FQN. |
| ingestion/src/metadata/ingestion/source/database/trino/service_spec.py | Registers TrinoTableParameter as the Trino data_diff parameter setter. |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/data_diff.py | Implements TrinoTableParameter.get_data_diff_url() override to set catalog/schema from FQN when using dict-based connection config. |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/__init__.py | Package init for the new Trino data_diff module. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 19, 2026 16:07
Copilot AI left a comment

Pull request overview

This PR fixes Trino/Starburst table diff failures (MISSING_CATALOG_NAME) by ensuring data_diff sessions use the table’s catalog/schema (from the table FQN) rather than the service-level connection defaults.

Changes:

  • Introduces TrinoTableParameter to override get_data_diff_url() and inject catalog/schema from the table FQN into the dict-based Trino connection config.
  • Wires the Trino connector spec to use the new data_diff parameter setter.
  • Adds a unit test covering catalog/schema override behavior for Trino dict URLs.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/data_diff.py | Adds TrinoTableParameter to override catalog/schema for dict-based Trino data-diff URLs. |
| ingestion/src/metadata/ingestion/source/database/trino/service_spec.py | Registers TrinoTableParameter as the Trino connector’s data_diff handler. |
| ingestion/tests/unit/observability/data_quality/validations/runtime_param_setter/test_base_diff_params_setter.py | Adds unit test verifying Trino dict URL has catalog/schema set from FQN. |

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 19, 2026 16:15
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

This PR fixes Trino/Starburst table diff failures (MISSING_CATALOG_NAME) by ensuring data_diff sessions use the table-specific catalog/schema parsed from the table FQN, rather than relying on the service-level Trino connection dict.

Changes:

  • Added TrinoTableParameter to override get_data_diff_url() and inject catalog/schema from the table FQN into Trino’s connection dict.
  • Registered the new Trino data-diff parameter setter in the Trino ServiceSpec.
  • Added a unit test validating catalog/schema override behavior for Trino.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/data_diff.py | Introduces TrinoTableParameter to override catalog/schema from the table FQN for data_diff. |
| ingestion/src/metadata/ingestion/source/database/trino/service_spec.py | Wires TrinoTableParameter into the Trino connector spec via data_diff=.... |
| ingestion/tests/unit/observability/data_quality/validations/runtime_param_setter/test_base_diff_params_setter.py | Adds unit coverage for Trino table diff URL/connection dict catalog+schema override. |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/__init__.py | Adds the package initializer for the new data_diff module. |

github-actions bot commented Mar 19, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (38)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (13)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

github-actions bot commented Mar 19, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| libpam-modules | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-modules-bin | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-runtime | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam0g | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (39)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.16.1 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (33)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| Authlib | CVE-2026-27962 | 🔥 CRITICAL | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28490 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28498 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28802 | 🚨 HIGH | 1.6.6 | 1.6.7 |
| PyJWT | CVE-2026-32597 | 🚨 HIGH | 2.10.1 | 2.12.0 |
| Werkzeug | CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.12.12 | 3.13.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.13.2 | 3.13.3 |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow-providers-http | CVE-2025-69219 | 🚨 HIGH | 5.6.0 | 6.0.0 |
| azure-core | CVE-2026-21226 | 🚨 HIGH | 1.37.0 | 1.38.0 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| google-cloud-aiplatform | CVE-2026-2472 | 🚨 HIGH | 1.130.0 | 1.131.0 |
| google-cloud-aiplatform | CVE-2026-2473 | 🚨 HIGH | 1.130.0 | 1.133.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| protobuf | CVE-2026-0994 | 🚨 HIGH | 4.25.8 | 6.33.5, 5.29.6 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| pyasn1 | CVE-2026-23490 | 🚨 HIGH | 0.6.1 | 0.6.2 |
| pyasn1 | CVE-2026-30922 | 🚨 HIGH | 0.6.1 | 0.6.3 |
| python-multipart | CVE-2026-24486 | 🚨 HIGH | 0.0.20 | 0.0.22 |
| ray | CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| tornado | CVE-2026-31958 | 🚨 HIGH | 6.5.3 | 6.5.5 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| stdlib | CVE-2025-68121 | 🔥 CRITICAL | v1.25.5 | 1.24.13, 1.25.7, 1.26.0-rc.3 |
| stdlib | CVE-2025-61726 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2025-61728 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2026-25679 | 🚨 HIGH | v1.25.5 | 1.25.8, 1.26.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 19, 2026 17:26
gitar-bot bot commented Mar 19, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Trino table diff now correctly sets catalog and schema from table FQN, fixing the indentation error that caused SyntaxError. Consider adding a guard for the tuple unpacking in TrinoTableParameter to handle FQNs with unexpected part counts.

💡 Edge Case: Unguarded tuple unpacking of fqn.split assumes exactly 4 parts

📄 ingestion/src/metadata/ingestion/source/database/trino/data_diff/data_diff.py:30

In TrinoTableParameter.get_data_diff_url, the destructuring _, catalog, schema, _ = fqn.split(table_fqn) will raise a ValueError if the FQN doesn't have exactly 4 parts. While table FQNs in OpenMetadata are conventionally 4-part (service.database.schema.table), fqn.split itself imposes no length constraint and simply returns however many tokens the parser finds. A malformed or truncated FQN would produce a confusing ValueError: not enough values to unpack instead of a clear error message. Other callers in the codebase (e.g., split_test_case_fqn) add explicit length checks before unpacking.

Suggested fix
fqn_parts = fqn.split(table_fqn)
if len(fqn_parts) != 4:
    raise ValueError(
        f"Expected a 4-part table FQN (service.catalog.schema.table), got: {table_fqn}"
    )
_, catalog, schema, _ = fqn_parts
✅ 1 resolved
Bug: Indentation error on test line 162 will cause SyntaxError

📄 ingestion/tests/unit/observability/data_quality/validations/runtime_param_setter/test_base_diff_params_setter.py:162
Line 162 has 3 spaces of indentation instead of 4, which will cause a SyntaxError (or IndentationError) when Python tries to parse the test file. This will prevent the entire test module from loading, meaning all tests in this file will fail to run.

161      assert isinstance(result1, dict)
162     assert isinstance(result2, dict)   # ← 3 spaces instead of 4
163      assert result1 is not service_level_dict
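
The failure mode in this finding can be reproduced without the test file. The sketch below compiles a small made-up function whose middle line is indented by 3 spaces; the function body is illustrative and is not taken from the PR.

```python
# A body line indented by 3 spaces between lines indented by 4.
source = (
    "def test_example():\n"
    "    first = 1\n"
    "   second = 2\n"  # 3 spaces instead of 4
    "    third = 3\n"
)


def parse_error_name(code: str) -> str:
    """Return the exception class name raised while compiling, or 'ok'."""
    try:
        compile(code, "<sketch>", "exec")
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return type(exc).__name__
    return "ok"


print(parse_error_name(source))  # IndentationError
```

Because the error is raised while the module is compiled, no test in the file is collected until the indentation is fixed.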

Copilot AI left a comment

Pull request overview

This PR fixes Trino/Starburst cross-catalog table diff failures by ensuring data_diff sessions use the table-specific catalog/schema parsed from the table FQN (instead of the service-level connection defaults).

Changes:

  • Introduces TrinoTableParameter to override get_data_diff_url() and inject catalog/schema from the table FQN into Trino’s dict-style connection config.
  • Wires the new Trino data-diff parameter setter into the Trino ServiceSpec.
  • Adds a unit test to validate catalog/schema overriding and prevent unintended dict mutation across calls.
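
A regression test of the kind described in the last bullet might look like the sketch below. It is not the test added in this PR: `inject_catalog` is a hypothetical helper standing in for the real override, and the dict keys and schema names are assumptions.

```python
import copy
from typing import Dict


def inject_catalog(conn: Dict[str, str], catalog: str, schema: str) -> Dict[str, str]:
    """Hypothetical helper: return a copy of conn with table-specific values."""
    updated = copy.deepcopy(conn)
    updated["catalog"] = catalog
    updated["schema"] = schema
    return updated


def test_no_cross_call_leakage() -> None:
    service_level_dict = {"driver": "trino", "catalog": "service_default"}
    snapshot = copy.deepcopy(service_level_dict)

    result1 = inject_catalog(service_level_dict, "cdl", "raw")
    result2 = inject_catalog(service_level_dict, "iceberg_nlm", "curated")

    # Each call returns a fresh dict with its own catalog and schema.
    assert isinstance(result1, dict)
    assert isinstance(result2, dict)
    assert result1 is not service_level_dict
    assert (result1["catalog"], result1["schema"]) == ("cdl", "raw")
    assert (result2["catalog"], result2["schema"]) == ("iceberg_nlm", "curated")
    # The shared service-level dict is unchanged after both calls.
    assert service_level_dict == snapshot


test_no_cross_call_leakage()
```

Comparing the shared dict against a snapshot is what catches an implementation that mutates the service-level config in place.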

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| ingestion/src/metadata/ingestion/source/database/trino/data_diff/data_diff.py | Adds Trino-specific table diff parameter handling to set catalog/schema from FQN when using dict connection configs. |
| ingestion/src/metadata/ingestion/source/database/trino/service_spec.py | Registers TrinoTableParameter as the data_diff implementation for Trino services. |
| ingestion/tests/unit/observability/data_quality/validations/runtime_param_setter/test_base_diff_params_setter.py | Adds a regression test ensuring per-table catalog/schema override and guarding against cross-call leakage. |

@sonarqubecloud

@github-actions

🟡 Playwright Results — all passed (17 flaky)

✅ 3387 passed · ❌ 0 failed · 🟡 17 flaky · ⏭️ 183 skipped

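| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| ✅ Shard 1 | 455 | 0 | 0 | 2 |
| 🟡 Shard 2 | 304 | 0 | 1 | 1 |
| 🟡 Shard 3 | 667 | 0 | 6 | 33 |
| 🟡 Shard 4 | 680 | 0 | 5 | 41 |
| 🟡 Shard 5 | 671 | 0 | 1 | 73 |
| 🟡 Shard 6 | 610 | 0 | 4 | 33 |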
🟡 17 flaky test(s) (passed on retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Table (shard 2, 2 retries)
  • Features/ActivityFeed.spec.ts › emoji reactions can be added when feed messages exist (shard 3, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 3, 1 retry)
  • Features/DataQuality/TestCaseIncidentPermissions.spec.ts › User with TEST_CASE.EDIT_ALL can see edit icon on incidents (shard 3, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 3, 1 retry)
  • Features/ImpactAnalysis.spec.ts › Verify Upstream connections (shard 3, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/ObservabilityAlerts.spec.ts › Alert operations for a user with and without permissions (shard 4, 1 retry)
  • Flow/PersonaFlow.spec.ts › Remove users in persona should work properly (shard 4, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Create DataProducts and add remove assets (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Multiple consecutive domain renames preserve all associations (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Glossary Term Add, Update and Remove (shard 5, 1 retry)
  • Pages/InputOutputPorts.spec.ts › Input port button visible, output port button hidden when no assets (shard 6, 1 retry)
  • Pages/ODCSImportExport.spec.ts › Multi-object ODCS contract - object selector shows all schema objects (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • VersionPages/GlossaryVersionPage.spec.ts › GlossaryTerm (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Labels

Ingestion, safe to test

Development

Successfully merging this pull request may close these issues.

Trino Table Diff Test Fails with MISSING_CATALOG_NAME

3 participants