Skip to content

chore(config): default the Iceberg catalog to rest#6049

Open
mengw15 wants to merge 1 commit into
apache:mainfrom
mengw15:chore/6034-iceberg-catalog-default-rest
Open

chore(config): default the Iceberg catalog to rest#6049
mengw15 wants to merge 1 commit into
apache:mainfrom
mengw15:chore/6034-iceberg-catalog-default-rest

Conversation

@mengw15

@mengw15 mengw15 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

Production uses the REST (Lakekeeper) Iceberg catalog, but common/config/src/main/resources/storage.conf still defaulted iceberg.catalog.type to postgres. This PR flips the default to rest so CI stays close to production. The STORAGE_ICEBERG_CATALOG_TYPE env override is untouched, and local-dev / single-node already set rest via env, so they are unaffected; postgres stays a supported type.

The amber unit job materializes workflow results into the Iceberg result store, which resolves this default catalog, so flipping it to rest alone would make the unit job reach a Lakekeeper it does not run. This PR pins the unit job to the postgres catalog it already provisions (STORAGE_ICEBERG_CATALOG_TYPE=postgres in build.yml), keeping it fast and Lakekeeper-free. The amber-integration job keeps using the default (now rest) against its provisioned Lakekeeper.

Depends on #6045 (Task 1, merged). Task 3 (make the CI integration suite mirror prod on REST while keeping postgres-catalog coverage there too) follows.

Any related issues, documentation, discussions?

Part of #6034 (Task 2 of 3). Follows #6045.

How was this PR tested?

  • CI: build / amber (unit, now pinned to postgres) and build / amber-integration (REST via Lakekeeper) exercise both catalog backends.
  • PythonWorkflowWorkerStartupConfigSpec checks the startup-config key names (type-independent), not icebergCatalogType's value, so it is unaffected by the default change.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-8)

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @Ma77Ball, @eugenegujing
    You can notify them by mentioning @Ma77Ball, @eugenegujing in a comment.

@codecov-commenter

codecov-commenter commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.33%. Comparing base (a1a7eb0) to head (6145a2e).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6049      +/-   ##
============================================
- Coverage     56.38%   56.33%   -0.06%     
+ Complexity     2986     2984       -2     
============================================
  Files          1129     1129              
  Lines         43794    43794              
  Branches       4743     4743              
============================================
- Hits          24693    24670      -23     
- Misses        17650    17671      +21     
- Partials       1451     1453       +2     
Flag Coverage Δ *Carryforward flag
access-control-service 70.00% <ø> (ø)
agent-service 44.59% <ø> (ø)
amber 57.25% <ø> (-0.06%) ⬇️
computing-unit-managing-service 0.00% <ø> (ø)
config-service 52.30% <ø> (ø)
file-service 62.81% <ø> (ø)
frontend 50.06% <ø> (-0.07%) ⬇️
notebook-migration-service 78.57% <ø> (ø)
pyamber 90.15% <ø> (-0.05%) ⬇️
python 90.76% <ø> (ø) Carriedforward from a1a7eb0
workflow-compiling-service 55.14% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

⚠️ Benchmark changes need a look

🟢 0 better · 🔴 12 worse · ⚪ 3 noise (<±5%) · 0 without baseline

Compared against main a1a7eb0 benchmarked on this same runner, so the delta is largely free of cross-runner hardware noise. The "7d avg" column still reflects the gh-pages dashboard. Treat <±5% as noise unless repeated.

Dashboard · Run

config throughput MB/s latency max Δ latest / 7d
🔴 bs=10 sw=10 sl=64 416 0.254 22,020/36,951/36,951 us 🔴 +22.0% / 🔴 +149.9%
🔴 bs=100 sw=10 sl=64 927 0.566 108,354/134,643/134,643 us 🔴 +9.8% / 🔴 +26.0%
🔴 bs=1000 sw=10 sl=64 1,089 0.665 913,016/1,000,870/1,000,870 us 🔴 +5.4% / 🟢 -7.2%
Baseline details

Latest main a1a7eb0 from same runner

config metric PR latest main 7d avg Δ latest Δ 7d
bs=10 sw=10 sl=64 throughput 416 tuples/sec 474 tuples/sec 786.27 tuples/sec -12.2% -47.1%
bs=10 sw=10 sl=64 MB/s 0.254 MB/s 0.29 MB/s 0.48 MB/s -12.4% -47.1%
bs=10 sw=10 sl=64 p50 22,020 us 19,458 us 12,495 us +13.2% +76.2%
bs=10 sw=10 sl=64 p95 36,951 us 30,297 us 14,784 us +22.0% +149.9%
bs=10 sw=10 sl=64 p99 36,951 us 30,297 us 18,468 us +22.0% +100.1%
bs=100 sw=10 sl=64 throughput 927 tuples/sec 986 tuples/sec 991.49 tuples/sec -6.0% -6.5%
bs=100 sw=10 sl=64 MB/s 0.566 MB/s 0.602 MB/s 0.605 MB/s -6.0% -6.5%
bs=100 sw=10 sl=64 p50 108,354 us 99,905 us 100,929 us +8.5% +7.4%
bs=100 sw=10 sl=64 p95 134,643 us 122,583 us 106,894 us +9.8% +26.0%
bs=100 sw=10 sl=64 p99 134,643 us 122,583 us 114,085 us +9.8% +18.0%
bs=1000 sw=10 sl=64 throughput 1,089 tuples/sec 1,116 tuples/sec 1,023 tuples/sec -2.4% +6.5%
bs=1000 sw=10 sl=64 MB/s 0.665 MB/s 0.681 MB/s 0.624 MB/s -2.3% +6.5%
bs=1000 sw=10 sl=64 p50 913,016 us 899,536 us 983,835 us +1.5% -7.2%
bs=1000 sw=10 sl=64 p95 1,000,870 us 949,314 us 1,023,777 us +5.4% -2.2%
bs=1000 sw=10 sl=64 p99 1,000,870 us 949,314 us 1,053,883 us +5.4% -5.0%
Raw CSV
config_idx,batch_size,schema_width,string_len,num_batches,total_ms,total_tuples,total_bytes,tuples_per_sec,mb_per_sec,lat_p50_us,lat_p95_us,lat_p99_us
0,10,10,64,20,480.38,200,128000,416,0.254,22019.93,36950.72,36950.72
1,100,10,64,20,2156.84,2000,1280000,927,0.566,108354.14,134642.56,134642.56
2,1000,10,64,20,18363.48,20000,12800000,1089,0.665,913015.87,1000870.49,1000870.49

@mengw15 mengw15 marked this pull request as draft July 1, 2026 21:37
@mengw15 mengw15 force-pushed the chore/6034-iceberg-catalog-default-rest branch from 722b1af to bc85e61 Compare July 1, 2026 21:42
@github-actions github-actions Bot added the ci changes related to CI label Jul 1, 2026
Production uses the REST (Lakekeeper) Iceberg catalog, but storage.conf
still defaulted iceberg.catalog.type to postgres. Flip the default to
rest so CI stays close to production; local-dev / single-node already set
rest via env, and postgres stays a supported type.

The amber unit job materializes workflow results into the Iceberg result
store, which resolves this default, so pin the unit job to the postgres
catalog it already provisions (STORAGE_ICEBERG_CATALOG_TYPE=postgres in
build.yml) instead of reaching a Lakekeeper it does not run. The
amber-integration job keeps using the default (now rest) against its
provisioned Lakekeeper.

Part of apache#6034.
@mengw15 mengw15 force-pushed the chore/6034-iceberg-catalog-default-rest branch from bc85e61 to 6145a2e Compare July 1, 2026 21:47
@mengw15 mengw15 marked this pull request as ready for review July 1, 2026 21:55
@mengw15 mengw15 requested review from Yicong-Huang and Copilot July 1, 2026 21:57

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aligns the default Iceberg catalog backend with production by switching the default in storage.conf from postgres to rest (Lakekeeper), while keeping CI’s amber unit job isolated from Lakekeeper by explicitly pinning that job to postgres via an environment override.

Changes:

  • Default storage.iceberg.catalog.type to rest in the shared storage configuration.
  • Set STORAGE_ICEBERG_CATALOG_TYPE=postgres for the amber unit-test workflow step so it continues using the provisioned postgres-backed catalog.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
common/config/src/main/resources/storage.conf Changes the default Iceberg catalog type from postgres to rest while preserving the env-var override behavior.
.github/workflows/build.yml Pins the amber unit test step to postgres to avoid requiring Lakekeeper after the default flips to rest.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

env:
AMBER_TEST_FILTER: skip-integration
# unit job uses its provisioned postgres catalog; default (rest) needs a Lakekeeper not run here
STORAGE_ICEBERG_CATALOG_TYPE: postgres

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we moved all tests needs catalog to amber-integration? Why do we still have to keep Postgres in CI?

Can we allow run unit tests without a catalog?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. #6045 moved the specs that test the catalog (IcebergDocumentSpec etc. — direct getInstance()); those are @IntegrationTEST now.

But this PR surfaced more: amber's "unit" job also runs non-unit tests — engine e2e specs (+ one scheduling) that run a full workflow and materialize results into Iceberg as a side effect, so they need the default catalog. Verified with rest/no-Lakekeeper, these fail: DataProcessingSpec, PauseSpec, ReconfigurationSpec, DefaultCostEstimatorSpec — they only passed before because the default was postgres (provisioned by the unit job).

I'll mark this draft while we settle that; the postgres pin just keeps CI green for now. Two directions:

(a) treat them as integration (@IntegrationTEST → amber-integration): + unit job truly catalog-free; − moves core engine tests to the slower job, and a scheduling spec doesn't obviously belong in "integration".
(b) decouple the result store

Either is its own refactor PR. WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. there are more cleaning to do. in this case, let's table it.

let's keep those test in amber for now, leave postgres in amber, change default to rest.

gradually we can move those tests to intergration, and clean up amber CI later.

iceberg {
catalog {
type = postgres # either hadoop, rest, or postgres
type = rest # either hadoop, rest, or postgres

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we support Hadoop?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so — hadoop maps to a HadoopCatalog on the local filesystem. But I've never used the hadoop catalog and don't have much context on it. I tried it for the unit job here, but HadoopCatalog floods commit-conflict errors under amber's concurrent result writes.

FWIW it looks like the earliest catalog type — added with the original IcebergDocument, before rest/postgres were introduced — I don't see it used in any deployment/CI. So it may be effectively legacy.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if that's not used, let's take the chance to remove it. we just officially don't support it. one less thing to support == one less thing to maintain.

@Yicong-Huang Yicong-Huang left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, see comments in line.

iceberg {
catalog {
type = postgres # either hadoop, rest, or postgres
type = rest # either hadoop, rest, or postgres

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if that's not used, let's take the chance to remove it. we just officially don't support it. one less thing to support == one less thing to maintain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci changes related to CI common

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants