Commit d303706
ci: add DBR LTS install check to catch ES-1960554-class regressions (#843)
* ci: add DBR LTS install check + fix pyarrow-compat it caught (ES-1960554)
The thrift 0.23.0 bump (PR #796, shipped in 4.2.7) broke `pip install` on
DBR LTS: thrift ships sdist-only and 0.23.0's setup.py calls sys.exit(0) on
the build-success path, killing the PEP 517 backend before pip writes
output.json. On the old setuptools shipped by DBR 14.3/15.4 LTS this is a
hard install failure (SEV0 ES-1960554); 4.2.7 was yanked and reverted (#840).
Our CI never caught it because every job installs via `poetry install` on a
modern runner -- it never does a fresh `pip install` of the built wheel on
an LTS toolchain, the real customer path that failed.
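The failure mode described above can be modelled in a few lines. This is a toy
simulation, not pip's or setuptools' actual code: `run_backend_hook`,
`well_behaved_hook`, `exiting_hook` and `output_written` are hypothetical names,
and the only property carried over from the description is that the result file
is written after the build hook returns.

```python
import json
import os
import sys
import tempfile


def run_backend_hook(hook, control_dir):
    """Toy stand-in for a PEP 517 in-process runner (hypothetical, simplified).

    The result file is written only after the hook returns, so anything that
    terminates the interpreter inside the hook skips the write.
    """
    result = {"return_val": hook()}
    with open(os.path.join(control_dir, "output.json"), "w") as fh:
        json.dump(result, fh)


def well_behaved_hook():
    # A build hook that simply returns the name of the artifact it produced.
    return "example-1.0-py3-none-any.whl"


def exiting_hook():
    # Mimics a setup.py that calls sys.exit(0) on the build-success path.
    sys.exit(0)


def output_written(hook):
    """Run the hook in a fresh directory and report whether output.json exists."""
    with tempfile.TemporaryDirectory() as control_dir:
        try:
            run_backend_hook(hook, control_dir)
        except SystemExit:
            # In a real subprocess this would end the process with status 0.
            pass
        return os.path.exists(os.path.join(control_dir, "output.json"))
```

Under these assumptions, `output_written(exiting_hook)` is false even though the
exit status is 0, so the caller has nothing to read back.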
CI check
--------
Adds a PR check (gated to dependency changes) that builds the wheel and
installs it INSIDE real DBR LTS clusters via the PECO workspace Jobs API
(no PyPI publish), then runs a SELECT 1 smoke test. Matrix = supported LTS
{13.3, 14.3, 15.4, 16.4, 17.3} x install target {base, pyarrow, kernel}.
Auth is OAuth M2M as the PECO service principal throughout (driver ->
workspace API and the notebook's connector -> warehouse smoke query); a PAT
is warehouse-scoped and rejected by the workspace REST API. Older LTS runtimes
ship a databricks-sdk too old for auth_type=oauth-m2m, so the smoke harness
upgrades databricks-sdk. Per-run artifacts are cleaned up in a finally block.
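The matrix described above is a plain cross product. A minimal sketch of how it
could be enumerated follows; the constant and function names (`LTS_VERSIONS`,
`INSTALL_TARGETS`, `build_matrix`) and the dictionary keys are illustrative
assumptions, not the workflow's actual identifiers.

```python
from itertools import product

# Values taken from the commit message; the names are hypothetical.
LTS_VERSIONS = ["13.3", "14.3", "15.4", "16.4", "17.3"]
INSTALL_TARGETS = ["base", "pyarrow", "kernel"]


def build_matrix(versions=LTS_VERSIONS, targets=INSTALL_TARGETS):
    """Return one entry per (DBR LTS version, install target) combination."""
    return [
        {"dbr": version, "target": target}
        for version, target in product(versions, targets)
    ]


if __name__ == "__main__":
    for entry in build_matrix():
        print(f"DBR {entry['dbr']} LTS / {entry['target']}")
```

Five runtimes times three install targets gives 15 combinations.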
Connector fix (caught by the check)
-----------------------------------
The check surfaced a real latent bug: a base install (no [pyarrow] extra)
runs against a runtime's bundled pyarrow, and on DBR 13.3/14.3 that pyarrow
predates the `promote_options` kwarg, so concat_table_chunks raised
`TypeError: concat_tables() got an unexpected keyword argument
'promote_options'` on the Arrow result path. utils.py now falls back to the
legacy `promote=True` (equivalent to promote_options="default") when the
kwarg is unsupported, with a regression test.
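The fallback described above might look roughly like the sketch below. It is
not the code in utils.py: `concat_with_promotion` is a hypothetical helper that
takes the concat function as a parameter so the logic can be exercised without
pyarrow installed, and `legacy_concat` / `modern_concat` are stand-ins for the
old and new `concat_tables` signatures.

```python
def concat_with_promotion(tables, concat_tables):
    """Concatenate tables, preferring promote_options and falling back to promote.

    `concat_tables` is expected to behave like pyarrow.concat_tables.
    """
    try:
        return concat_tables(tables, promote_options="default")
    except TypeError as exc:
        # Only swallow the "unexpected keyword argument" case; re-raise the rest.
        if "promote_options" not in str(exc):
            raise
        return concat_tables(tables, promote=True)


def legacy_concat(tables, promote=False):
    # Stand-in for an older signature that predates promote_options.
    return ("legacy", promote)


def modern_concat(tables, promote_options="none"):
    # Stand-in for a newer signature that accepts promote_options.
    return ("modern", promote_options)
```

With pyarrow available, the call would be
`concat_with_promotion(tables, pyarrow.concat_tables)`. Checking the exception
message keeps unrelated `TypeError`s from being silently retried.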
Validated end-to-end against the PECO workspace: green on thrift 0.22.0, and
re-widening the pin to <0.24.0 fails on 14.3 and 15.4 with the exact
output.json error -- a true guard, not a check that always passes.
Also adds an incident-linked comment on the thrift pin so nobody re-widens it
before the upstream fix (THRIFT-6067 / apache/thrift#3584) ships.
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
* test(e2e): authenticate e2e via OAuth M2M so staging tests match DATABRICKS_USER
The e2e suite connected via a PAT (DATABRICKS_TOKEN). The Personal Staging
Location tests PUT/GET/REMOVE against stage://tmp/<DATABRICKS_USER>/..., where
DATABRICKS_USER is the PECO service principal (TEST_PECO_SP_ID). A personal
stage is identity-scoped by design (there is even a test asserting you cannot
touch another user's stage), so the connecting identity MUST equal
DATABRICKS_USER. When DATABRICKS_TOKEN authenticates as a different identity,
those tests fail with `PERMISSION_DENIED: <user> does not have access to
Personal Stage`.
Switch the e2e connection to OAuth M2M as the service principal via
credentials_provider (conftest.auth_connect_kwargs), so the connecting identity
IS the SP == DATABRICKS_USER. Falls back to the PAT when SP OAuth creds aren't
set, so local PAT runs are unaffected. Wires DATABRICKS_CLIENT_ID /
DATABRICKS_CLIENT_SECRET (TEST_PECO_SP_ID / TEST_PECO_SP_OAUTH_SECRET, already
in azure-prod) into code-coverage.yml.
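A minimal sketch of the selection logic described above, assuming the
service-principal credentials arrive as environment variables. The function
body is illustrative rather than the actual conftest code, and
DATABRICKS_SERVER_HOSTNAME is an assumed variable name. The databricks-sdk
import is deferred so that PAT-only runs do not need the SDK.

```python
import os


def auth_connect_kwargs(env=None):
    """Return connect() kwargs: OAuth M2M when SP creds are set, else the PAT."""
    env = os.environ if env is None else env
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")

    if client_id and client_secret:
        # Assumed variable name for the workspace host.
        host = env.get("DATABRICKS_SERVER_HOSTNAME")

        def credentials_provider():
            # Imported lazily: only needed on the M2M path.
            from databricks.sdk.core import Config, oauth_service_principal

            config = Config(
                host=f"https://{host}",
                client_id=client_id,
                client_secret=client_secret,
            )
            return oauth_service_principal(config)

        return {"credentials_provider": credentials_provider}

    # Fallback: plain personal access token, so local PAT runs keep working.
    return {"access_token": env.get("DATABRICKS_TOKEN")}
```

The result would be splatted into the connection call, for example
`sql.connect(server_hostname=..., http_path=..., **auth_connect_kwargs())`.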
Verified locally against the PECO workspace: all 9 staging_ingestion e2e tests
pass via the real M2M path (including fails_to_modify_another_staging_user,
which validates the identity scoping). Kernel e2e files are unchanged (they run
in kernel-e2e.yml, ignored by code-coverage.yml).
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
* test(e2e): add databricks-sdk dev dependency for OAuth M2M auth
The e2e M2M auth (conftest.auth_connect_kwargs) imports
databricks.sdk.core.oauth_service_principal, but databricks-sdk was not a
project dependency, so `poetry install` in code-coverage.yml didn't provide
it -- every e2e connection failed with `ModuleNotFoundError: No module named
'databricks.sdk'`. Add it to the dev group (test-only; not a runtime dep of
the connector). CI's setup-poetry runs `poetry lock` before install, so the
lockfile is regenerated on the runner.
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
* Revert e2e OAuth M2M; use SP PAT instead (keep DBR LTS check on M2M)
The azure-prod DATABRICKS_TOKEN is now a personal access token owned by the
PECO service principal, so its identity matches DATABRICKS_USER. That fixes the
Personal Staging Location tests (stage://tmp/<SP>/...) with a plain PAT, without
the OAuth M2M machinery -- which also broke the retry/HTTP tests, since M2M
makes a live token-endpoint call that those tests' urllib3 mocking intercepts.
Reverts the e2e auth changes (conftest.auth_connect_kwargs + the consumer call
sites + the databricks-sdk dev dep + the code-coverage.yml SP env) back to the
plain access_token path. The DBR LTS install check keeps OAuth M2M: it hits the
workspace Jobs/SCIM API (which rejects a warehouse-scoped PAT), is proven 15/15
green on M2M, and installs databricks-sdk itself in its own workflow.
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
---------
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
1 parent 815b341 commit d303706
6 files changed
Lines changed: 613 additions & 0 deletions
File tree
- .github/workflows
- scripts
- src/databricks/sql
- tests/unit