Skip to content

feat: authorize and enable writes to data connections#162

Merged
alukach merged 15 commits into
mainfrom
worktree-authenticated-writes
Jun 26, 2026
Merged

feat: authorize and enable writes to data connections#162
alukach merged 15 commits into
mainfrom
worktree-authenticated-writes

Conversation

@alukach

@alukach alukach commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

What

Enables authenticated writes through the proxy and gates them so only callers with write permission can edit. Builds on #147 (per-connection backend OIDC federation, which added the signing path but left writes 405'd and unauthorized) and on multistore 0.6.0 (batch delete, CopyObject rejection, wider write headers, and the BucketRegistry::authorize_key hook).

How

Enable writes. Remove the edge 405 short-circuit so PUT/POST/DELETE flow through the gateway, where #147's backend-auth middleware signs them. CORS now advertises the write methods.

Authorize writes. The registry gates every write in resolve_product, after the subject-scoped product + connection fetches (so a caller who can't even see the resource is already 404/403'd). A write is denied unless all hold:

  • the caller is authenticated (anonymous → denied; no subject to authorize with);
  • the data connection is not read-only;
  • the connection can sign writes (an s3_web_identity_role);
  • the caller holds the product's write permission (Source API /permissions).

The connection-level checks run before the permissions fetch, so a write the connection can't accept skips the API call. is_write_action — in the wasm-free, natively-tested authz module — classifies the operation (a denylist over reads, fail-safe by direction); the rest of the gate is inline in the registry.

Where it runs. get_bucket threads whether the op is a write into resolve_product. DataConnection gains a required read_only field (absent fails the fetch rather than defaulting to writable). get_or_fetch_permissions fetches the caller's product permissions (60s TTL, per-subject cache key).

Batch delete. SourceCoopRegistry::authorize_key is overridden to permit keys (product-level authz already ran in get_bucket); multistore's MappedRegistry forwards authorize_key so the override is honored (it was previously shadowed by the deny-all default).

Authorization model

Signing is a property of the connection; authorization is a property of the caller + operation — orthogonal, so a public product can sit on a private bucket the proxy signs into. This is the account-permission half of ADR-005 (/permissions), with the Role-ceiling half deferred to the Role system (#126); authorize_key is the seam where per-prefix ceiling enforcement will land.

Dependency

multistore* crates are pinned to the published 0.6.0 release on crates.io (batch delete + authorize_key/MappedRegistry forward shipped there). No git/branch deps remain.

Test plan

  • cargo check --target wasm32-unknown-unknown — clean
  • cargo clippy --target wasm32-unknown-unknown — clean
  • cargo fmt --check — clean
  • cargo testauthz (2), backend_auth (10), routing (12), pagination (13)
  • Integration (live API): test_anonymous_write_denied confirms an anonymous write to a public product is denied (403) at the gate; full CI green
  • Positive end-to-end — an authorized, signed write against a connection with a configured web-identity role (needs infra; not exercisable in CI)

Deployment prerequisite

Writes to a connection require it to be configured with authentication.type = s3_web_identity_role and a role_arn whose IAM trust accepts the proxy's OIDC provider (aud = sts.amazonaws.com, sub = scv1:conn:{id}). Connections without a role keep working read-only/anonymous.

🤖 Generated with Claude Code

@alukach

alukach commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Alignment with the Role direction (#126 / ADR-004, ADR-005, ADR-006)

Cross-checked this PR against ADR-004 (inbound auth / Roles), ADR-005 (authorization model) and ADR-006 (outbound storage). It's a forward-compatible increment toward that target, scoped to the outbound write path + the request-time authorization gate — complementary to #126, which builds the inbound Role system.

Outbound writes ↔ ADR-006. ADR-006's preferred model is OIDC token issuance: the proxy mints short-lived JWTs and exchanges them at the cloud's STS, no stored secrets. That's exactly what this PR wires (AwsBackendAuthAssumeRoleWithWebIdentity), using the connection's s3_web_identity_role (ADR-006 "direct federation"). Not yet handled: the s3_access_key stored-credential fallback (ADR-006) for backends without OIDC federation (R2/B2/…); those connections currently fall back to anonymous and reject writes.

Authorization ↔ ADR-005. ADR-005 defines effective permission as Role ceiling (from token) ∩ account permissions (from API), per-request, with prefix enforcement. This PR implements the account-permission half — the /permissions write check + read-only/anonymous gates in get_bucket — which is the whole story for the _default Role (ceiling = *). The ceiling half is future work landing with #126.

The batch-delete authorize_key change is the ADR-005 seam. ADR-005 step 6 (per-key prefix enforcement) and step 2 (local ceiling check from the SessionToken) require request-time per-key authz driven by the registry, not by static STS scopes baked into the credential. The multistore BucketRegistry::authorize_key hook added here is precisely that seam: today it returns true (defer to the product-level account-permission check, matching the unlimited _default ceiling); when scoped Roles land, the caller's ceiling travels on identity and this method enforces the Role's resource prefixes against the key.

Re: "does the _default role just need scopes?" No — ADR-005 makes the intended _default ceiling ["*"] (unlimited), with the real constraint being the dynamic account-permission lookup (intersection at request time). Static per-bucket STS scopes can't model that (buckets are dynamic/unbounded, the token is minted before the target is known, permission changes must propagate live). So the fix is request-time API-driven authz, which is what the authorize_key delegation enables — not pre-baked scopes.

Out of scope here (owned by #126): the inbound /.sts overhaul to POST /.sts/assume-role-with-web-identity with account-owned Role URNs (sc::{account}::role/{name}). This PR leaves the existing single-_default-role StsCredentialRegistry untouched.

Writes were short-circuited with a 405 and the registry only authorized
reads. Remove the short-circuit so PUT/POST/DELETE flow through the
gateway (where the backend-auth middleware signs them), and add a
caller-side authorization layer:

- `authz`: a wasm-free, natively-tested module. `is_write_action`
  classifies the operation; `authorize_write` permits a write only when
  the connection is not read-only, the connection can sign (an S3
  web-identity role), and the caller holds the product's `write`
  permission. Anonymous callers are rejected in the registry.
- registry: `get_bucket` threads whether the op is a write into
  `resolve_product`, which runs the gate after the subject-scoped product
  and connection fetches. `DataConnection` gains `read_only`
  (fail-closed). `authorize_key` is overridden so batch-delete keys are
  permitted (product-level authz already ran in `get_bucket`).
- cache: `get_or_fetch_permissions` fetches the caller's product
  permissions (60s TTL).
- CORS advertises PUT/POST/DELETE.

Tracks the multistore data-edit-operations branch, which adds batch
delete and the `BucketRegistry::authorize_key` hook (forwarded by
`MappedRegistry`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alukach alukach force-pushed the worktree-authenticated-writes branch from 90baf26 to 03dca48 Compare June 24, 2026 05:16
@claude

claude Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Claude finished @alukach's task in 2m 10s —— View job


✅ No blocking issues — safe to merge.

  • Write gate logic is correct. The four-condition check in registry.rs:164–196 (subject present → connection not read-only → connection signable → caller holds "write" permission) is the right order: connection-level denials short-circuit before the permissions API call, and the read_only field is required with no serde default so a missing flag fails-closed rather than defaulting to writable.

  • is_write_action denylist is fail-safe. src/authz.rs:14 uses a denylist over reads — any unknown future Action variant is treated as a write, never granted as a read. Correct direction.

  • authorize_key correctness depends on get_bucket running first. registry.rs:87–113 returns is_write_action(action) without re-checking caller authorization. The comment documents this dependency explicitly. No issue today since multistore only calls authorize_key during batch delete (after get_bucket), but if a future multistore version calls it on a cold path the key would be allowed without an authz check. Documented, acceptable.

  • Cargo.toml [patch.crates-io] still points to a git branch. Cargo.toml:72–76 overrides all multistore crates to branch = "fix/decode-aws-chunked-writes" (commit 71b1513 per Cargo.lock). The Cargo.lock pins the commit so builds are reproducible. The TEMPORARY comment accurately explains why (aws-chunked write-decode fix not yet released upstream). However the PR description says "No git/branch deps remain" — that claim is stale; the two most recent commits (68b9187, 2f16482) explicitly added this patch after that was written. No blocking concern, but the description should be updated before merge to avoid confusion.

  • CORS expansion is safe. lib.rs:480 now advertises PUT, POST, DELETE globally. CORS is a browser-side restriction; server-side authz still gates all writes. No security regression.

  • Preview vars refactor (wrangler.preview.toml, preview.yml) is clean and self-contained.

@alukach alukach changed the title feat: authenticated writes to upstream data connections feat: authorize and enable writes to data connections Jun 24, 2026
…ites

Check the connection-level write preconditions (not read-only, signable)
before the per-caller permissions API call, so a write to a read-only or
unsignable connection is denied without the round-trip. Splits the gate
into two pure predicates (connection_accepts_writes, permits_write).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown

🚀 Latest commit deployed to https://source-data-proxy-pr-162.source-coop.workers.dev

  • Date: 2026-06-25T23:00:00Z
  • Commit: 82819dc

Drop the connection_accepts_writes / permits_write wrappers (one-line
boolean / .any checks) and inline them in resolve_product. Make
DataConnection.read_only a required field instead of a fail-closed default
helper. authz keeps only is_write_action (the one classification with
branching logic worth a native test).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The proxy no longer blanket-405s writes; they route through the gateway
and are authorized there. Assert the new contract: an unauthenticated
write to a real product is denied with 403 at the gate (before any
backend call), instead of the obsolete 405.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Defense-in-depth from auto-review: return is_write_action(action) instead
of a blanket true. No behavior change for batch delete (DeleteObject is a
write), but it never blanket-allows a read should multistore ever call
authorize_key on a non-write path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alukach

alukach commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Thanks — dispositions on the three findings:

1. read_only required field / deployment ordering — confirmed low risk: read_only is a required z.boolean() (not .optional()) in source.coop's deployed data-connection schema, so the API already returns it on every connection. The fail-closed (required, no serde default) choice stands. No change.

2. authorize_key ignoring action — adopted (ff5d081). It now returns is_write_action(action) instead of a blanket true. No behavior change for batch delete (DeleteObject is a write), but it never blanket-allows a read if a future multistore version ever calls authorize_key on a non-write path. Cheap defense-in-depth — good call.

3. PATCH no longer 405'd at the edge — conscious decision to leave as-is. PATCH isn't an S3 verb, so it's deliberately absent from the CORS allow-list, and a direct PATCH reaching the gateway gets a 4xx there (no S3 op maps to it). No client sends PATCH to an S3 endpoint; not worth a special-case at the edge.

multistore 0.6.0 ships the data-edit-operations features (developmentseed/
multistore#88: batch delete, CopyObject rejection, wider write headers,
authorize_key), so repoint the multistore* git deps from the moving
worktree-data-edit-ops-plan branch to the immutable v0.6.0 tag.

0.6.0 also renames RoleConfig.required_audience -> required_audiences (Vec);
StsCredentialRegistry::new now maps its Option<String> through into_iter()
(None -> [] no restriction, Some(x) -> [x]), preserving behavior.

cf-workers 0.6.0 isn't on crates.io yet, so all crates stay git-pinned to the
tag (single-sourced); switch to crates.io version deps once it publishes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…pendency

Addresses auto-review notes: document why DATA_CONNECTION_CACHE_SECS is
longer than the permission TTL despite gating writes (read-path cost; the
urgent revocation case rides the 60s permission TTL), and that authorize_key
correctness depends on a prior successful get_bucket for the same name.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@alukach

alukach commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Thanks — addressed all four in c75bdcb:

  1. read_only TTL asymmetry (cache.rs) — documented the deliberate choice rather than shortening the TTL. DATA_CONNECTION_CACHE_SECS is fetched on every request (read + write), so a short TTL taxes the read path; flipping a connection read-only freezes all writers and is a deliberate admin act where ~5 min lag is acceptable. The urgent case — revoking a single (e.g. compromised) account's write grant — rides the 60s permission TTL, not this one, so the security argument is already covered there.

  2. authorize_key get_bucket dependency (registry.rs) — comment now states explicitly that correctness depends on a prior successful get_bucket for the same name, and that if a future multistore ever invoked authorize_key without it, this gate alone would be insufficient.

  3. get_or_fetch_permissions shape — keeping Vec<String>. The deployed endpoint (permissions/route.ts) returns a bare array (NextResponse.json(permissions)), so the suggested { permissions: [...] } wrapper struct would expect an envelope the API doesn't send — it'd break the contract, not document it. The existing parse error already names the failing URL.

  4. test_anonymous_write_denied live product — keeping cholmes/admin-boundaries. Every other test in test_integration.py already depends on that product, so the whole suite shares the dependency; pinning just this one test to a synthetic product would add inconsistency without making the suite more robust (if the product vanished, the suite fails regardless).

Also note: the multistore dep is now repointed to the published 0.6.0 release (566de37), so the pre-merge feature-branch callout is resolved.

alukach and others added 2 commits June 24, 2026 23:12
All five multistore crates (including multistore-cf-workers) publish 0.6.0
on crates.io, so the git-tag pin and its single-sourcing rationale were
unnecessary. Plain version deps resolve a single multistore 0.6.0 from the
registry; no git dependency remains.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d-writes

# Conflicts:
#	Cargo.lock
#	Cargo.toml
#	src/sts.rs
The source.coop API emits a lowercase "write" today, but comparing with
eq_ignore_ascii_case avoids silently denying a legitimately-authorized
writer if the casing ever drifts. Strictly fail-safe: only a real write
grant matches either way.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
wrangler.preview.toml had no [vars] block, so AUTH_AUDIENCE was unset and
/.sts fail-closed with 501 ("STS token exchange is not configured"). Point
preview at the staging Ory issuer and accept the staging frontend + CLI
client_ids so the token exchange works on preview branches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Preview proxies call the staging API and AWS IAM, which trust only issuer
https://data.staging.source.coop + kid data-proxy-1 and fetch the JWKS from
that host. Signing the on-behalf-of API token / AWS federation assertion as
the per-PR workers.dev hostname (kid source-proxy-1) fails JWKS verification
(ERR_JWKS_NO_MATCHING_KEY), so authenticated calls fall back to anonymous and
403 on restricted products. Sign as the canonical staging identity instead.

Requires the `preview` GitHub environment's OIDC_PROVIDER_KEY to be the
staging signing key.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Every preview var is now static (the OIDC issuer no longer templates the
per-PR hostname), so consolidate LOG_LEVEL, SOURCE_API_URL and OIDC_PROVIDER_*
into wrangler.preview.toml [vars] beside AUTH_*, and drop the now-empty
var_overrides. AUTH_ISSUER was duplicated across both; de-duped. preview.yml
keeps only the per-PR dynamic bindings (worker_name, PUBLIC_LOG_STREAM).
Behavior-preserving: the same vars reach the worker.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Patch multistore* to developmentseed/multistore#92
(fix/decode-aws-chunked-writes) so preview/staging pick up the aws-chunked
body decode. Without it, the modern aws-cli's default trailer-checksum
uploads are stored as raw chunk framing — corrupted writes. Revert this
[patch.crates-io] block and bump the deps to a crates.io release once the
fix ships.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-resolve the multistore git patch to the rewritten
developmentseed/multistore#92, which now stream-re-signs aws-chunked uploads
(zero-copy) instead of buffering+decoding — relevant on the Workers memory
ceiling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alukach alukach enabled auto-merge (squash) June 26, 2026 04:37
@alukach alukach disabled auto-merge June 26, 2026 04:37
@alukach alukach merged commit 85972e8 into main Jun 26, 2026
13 checks passed
@alukach alukach deleted the worktree-authenticated-writes branch June 26, 2026 04:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant