Skip to content

[SEA-NodeJS] (2/8) napi-rs binding consumer — TS loader + build script#380

Draft
msrathore-db wants to merge 6 commits into
msrathore/sea-abstractionfrom
msrathore/sea-napi-binding
Draft

[SEA-NodeJS] (2/8) napi-rs binding consumer — TS loader + build script#380
msrathore-db wants to merge 6 commits into
msrathore/sea-abstractionfrom
msrathore/sea-napi-binding

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

@msrathore-db msrathore-db commented May 16, 2026

Stack

Linear stack of 8 PRs landing the M0 + M1 Phase 1 SEA NodeJS work. Merge in order from base ↑ to tip. The tip branch (msrathore/sea-auth-u2m, PR #383) is the single snapshot containing everything in flight — point your test or benchmark harness at it for an end-to-end check.

Pos PR Branch Scope
1/8 #378 sea-abstraction IBackend / ISessionBackend / IOperationBackend interfaces
2/8 #380 sea-napi-binding TS loader + build script for the kernel-provided .node artifact
3/8 #377 sea-errors-logging Kernel ErrorCode → JS error-class mapping (M0 minimum)
4/8 #379 sea-auth PAT auth via useSEA: true
5/8 #382 sea-execution executeStatement + openSession (sessionConfig, initialCatalog/Schema)
6/8 #381 sea-results CloudFetch + Inline Arrow result fetching
7/8 #384 sea-operation cancel / close / finished lifecycle + INTERVAL parity + napi-relocation acceptance (absorbed sea-integration content)
8/8 #383 sea-auth-u2m ← TIP M1 Phase 1 OAuth M2M + U2M (5 review rounds, ZERO HIGH at close)

Companion kernel stack (databricks/databricks-sql-kernel): 8 PRs — root #26 (async-public-api) → #27#25#29#28#30#24#23 (tip).

Policy: new PRs always stack on the current tip. No sibling/parallel topology. No force-pushes on existing PRs unless absolutely necessary; if a PR's content is wrong, add a fix-up commit on top of the stack tip rather than rewriting history.


This PR is position 2/8.

Summary

Thin TypeScript loader for the napi-rs binding (SeaNativeLoader), generated native/sea/index.d.ts, and the build:native npm script that compiles against the kernel workspace's napi/ sub-crate.

Companion kernel PR

The Rust napi crate itself lives in the kernel repo — see databricks/databricks-sql-kernel#25 (Database / Connection / Statement / ResultStream classes) and its skeleton predecessor #27.

Test plan

  • npm run build:native produces native/sea/index.linux-x64-gnu.node
  • ✅ Loader correctly require()s the platform-specific binary

Draft until you give the go for review.

Tracking

Co-authored-by: Isaac

@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@msrathore-db msrathore-db force-pushed the msrathore/sea-napi-binding branch from 84d0860 to 20231c8 Compare May 17, 2026 13:22
@msrathore-db msrathore-db changed the base branch from main to msrathore/sea-abstraction May 17, 2026 13:22
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

msrathore-db added a commit that referenced this pull request May 24, 2026
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Creates the napi-rs binding skeleton: Cargo.toml + lib.rs + module
stubs for database/connection/statement/result/error/logger. Captures
napi-rs tokio Handle via OnceCell in runtime.rs. Single working
#[napi] fn version() proves the binding loads + executes end-to-end
in Node.

Depends on krn-async-public-api branch (path dep on kernel).

Round 2 will add open/execute/fetch methods.

Signed-off-by: Madhavendra Rathore <[email protected]>
…wired

Adds real async methods on the opaque wrappers backing M0:
- openSession (free function) with PAT → kernel Session
- Connection::execute_statement → kernel ExecutedStatement
- Statement::fetch_next_batch / schema / cancel / close → kernel ResultStream
- Arrow batches returned as IPC bytes (per Layer 2 design)
- Error mapping preserves kernel ErrorCode + SQLSTATE for TS layer
- All entry points wrapped in catch_unwind

End-to-end smoke test against pecotesting passes.
No new dependencies beyond arrow-{ipc,array,schema} + futures.
Uses kernel async public API (no block_on).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
…indings

Round 1 scaffold declared tracing + tracing-subscriber as deps but
never used them. Removed. Logger bridge will re-add in round 3.

Other findings from 6b3affd-2026-05-15.md reviewed:
- Finding 2 (Database::Drop unreachable in Round 1b) — obsoleted by
  Round 2 (40d0b57): database.rs no longer declares a Database struct
  or Drop impl; it is now an `open_session` free function.
- Finding 3 (empty Connection::Drop) — obsoleted by Round 2: the Drop
  impl now spawns a real fire-and-forget close on the captured tokio
  handle.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
Per D-006 architectural decision (Python team's workspace pattern):
all language bindings (PyO3, napi-rs) now live as workspace siblings
in the kernel repo at databricks-sql-kernel/{pyo3,napi}/.

What this commit removes from the nodejs repo:
- native/sea/Cargo.toml (path dep relocated; package now at
  databricks-sql-kernel/napi/Cargo.toml with path = "..")
- native/sea/build.rs
- native/sea/src/* (lib, runtime, database, connection, statement,
  result, error, logger, util — all 9 files)
- native/sea/package.json (the @databricks/sea-native-linux-x64-gnu
  sub-package moves to the kernel workspace too)
- native/sea/index.js (regenerated artifact)

What stays in nodejs:
- native/sea/index.d.ts — TS declarations consumed by lib/sea/ adapter
- native/sea/README.md (new) — explains the move; points readers at
  databricks-sql-kernel/napi/

What's updated:
- package.json: `build:native` and `build:native:debug` scripts now
  delegate to the kernel workspace via $DATABRICKS_SQL_KERNEL_REPO
  (defaults to ../../databricks-sql-kernel-sea-WT/napi-binding for the
  local dev worktree layout). Build copies index.node + index.d.ts
  back into native/sea/ for the loader to find.

Why workspace co-location:
- Arrow version pinning lockstep — no silent IPC version drift
- path = ".." (clean) vs ../../../../databricks-sql-kernel-sea-WT/...
- Single CI: cargo build --workspace covers kernel + pyo3 + napi
- Kernel API changes that break either binding caught at PR-review time
- Future cgo binding for Go SEA slots in as another workspace member

This branch (sea-napi-binding) is now a thin consumer of the kernel
napi crate. The actual Rust code lives at krn-napi-binding HEAD on
the kernel repo (commit debe3d7).

Signed-off-by: Madhavendra Rathore <[email protected]>
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).

Signed-off-by: Madhavendra Rathore <[email protected]>
@msrathore-db msrathore-db force-pushed the msrathore/sea-napi-binding branch from 6b70ce4 to 548a14b Compare May 24, 2026 23:23
Comment thread .npmignore
# selects the per-platform `.node` artifact from `@databricks/sea-native-*`
# optionalDependencies (populated when the kernel CI publishes them);
# the .d.ts is the consumer-facing type contract.
!native/sea/index.js
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High — release-engineering landmine that crashes every consumer the moment PR 3/8+ wires SEA in.

Verified at HEAD 548a14b8:

  • git ls-tree native/sea/ shows ONLY README.md and index.d.ts — no index.js.
  • .gitignore:17 ignores native/sea/index.js (it's regenerated by npm run build:native).
  • .npmignore:10 allow-lists !native/sea/index.js to ship in the tarball.
  • package.json scripts has no prepack or prepublishOnly hook that runs build:native.
  • The loader at lib/sea/SeaNativeLoader.ts:82 does require('../../native/sea') — Node's resolver looks for index.js in that directory.

Result: any npm publish from a workspace that didn't manually run npm run build:native first ships a tarball missing the napi-rs router. Consumers get MODULE_NOT_FOUND with a misleading hint that tells them to install the optional dep (which is already installed) instead of pointing at the missing router shim.

PR #380 doesn't crash anything today because nothing calls getSeaNative(). The moment PR 3/8 wires SeaBackend.connect() to the loader, every install bricks.

  • Suggested fix: either (a) commit native/sea/index.js (analogous to the committed index.d.ts — the file is small and stable), or (b) add a prepack script that runs build:native gated on DATABRICKS_RELEASE_BUILD=1, with an explicit [ -f native/sea/index.js ] || (echo "Missing native router; run npm run build:native first" && exit 1) assertion that fails the pack if it's still missing.

Found by: security, ops, maintainability, devils-advocate (4 reviewers).


Posted by code-review-squad · /full-review

Comment thread tsconfig.json
"forceConsistentCasingInFileNames": true
"forceConsistentCasingInFileNames": true,
"baseUrl": "./",
"paths": {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — published declaration files reference an unresolvable module specifier.

Verified: tsconfig.build.json extends tsconfig.json, so the paths: { "@sea-native": ["./native/sea/index.d.ts"] } mapping propagates into the production build. TypeScript does NOT rewrite path-aliased specifiers in emitted declarations.

When this loader is re-exported from the public surface (which is the goal in PR 3/8+), every downstream TS consumer of @databricks/sql will see import type { ... } from '@sea-native' in the .d.ts and fail type-check with "Cannot find module '@sea-native'".

Today it doesn't break anything because (a) only import type is used in lib/sea/SeaNativeLoader.ts:26 (erased at runtime) and (b) the loader isn't re-exported from lib/index.ts. But the bug is already in the published surface waiting to be triggered.

  • Suggested fix: Drop the paths mapping and use a relative import type { … } from '../../native/sea'. The alias is used by exactly one file — it buys nothing.

Found by: architecture, language, agent-compat, devils-advocate (4 reviewers).


Posted by code-review-squad · /full-review


function loadFailureHint(err: NodeJS.ErrnoException): string {
const platform = `${process.platform}-${process.arch}`;
const installHint = `Install the matching optional dependency (e.g. @databricks/sea-native-${platform}).`;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — every customer who hits this error is sent on a wild-goose chase.

Verified: line 55 builds platform = '${process.platform}-${process.arch}'. On linux/x64 that produces linux-x64; on darwin/arm64 it produces darwin-arm64. The error message then says:

Install the matching optional dependency (e.g. @databricks/sea-native-linux-x64).

But the actual published package (per package.json:95) is @databricks/sea-native-linux-x64-**gnu**. The hint's recommended npm install will 404.

Same issue cross-platform:

  • linux/x64 → suggests @databricks/sea-native-linux-x64; actual is -linux-x64-gnu or -linux-x64-musl

  • linux/arm64 → suggests @databricks/sea-native-linux-arm64; actual is -linux-arm64-gnu

  • win32/x64 → suggests @databricks/sea-native-win32-x64; actual is -win32-x64-msvc

  • Suggested fix: either compute the napi-rs triple correctly (mirror the router's platform→triple table — easier said than done) or drop the package-name example: "Install the matching @databricks/sea-native-* optional dependency for your platform; see the README for the supported triple list."

Found by: ops, agent-compat.


Posted by code-review-squad · /full-review

Comment thread package.json
"optionalDependencies": {
"lz4": "^0.6.5"
"lz4": "^0.6.5",
"@databricks/sea-native-linux-x64-gnu": "0.1.0"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — by far the highest-consensus finding; 7 reviewers flagged it.

Only @databricks/sea-native-linux-x64-gnu: 0.1.0 is in optionalDependencies. macOS-arm64, macOS-x64, linux-arm64-gnu, linux-x64-musl (Alpine), win32-x64-msvc, etc. all have zero installable binding.

Three coupled consequences:

  1. Silent degradation — on every non-linux-x64-gnu platform, the loader will hit MODULE_NOT_FOUND. Users may not realize they're stuck on Thrift.
  2. Misleading hint (see M2) — the loader tells them to install a package that doesn't exist.
  3. Supply-chain squat risk — the unlisted names (@databricks/sea-native-darwin-arm64, …-darwin-x64, etc.) are not on the npm registry. Confirm @databricks org scope is locked to Databricks publishers, OR pre-register empty placeholder packages now, so an external party can't typosquat @databricks/sea-native-darwin-arm64 before Databricks does and have it autoload via the napi-rs router for darwin-arm64 consumers.
  • Suggested fix: Decide between two coherent models:
    • (a) Intentional M0 scope — explicitly document "M0 = linux-x64-gnu only; other platforms continue using Thrift exclusively" in the README and update the loader's error message to match.
    • (b) Multi-platform support — list all planned triples in optionalDependencies (npm tolerates missing optionalDeps on the registry — they skip silently with a warning).

Whichever you pick, address the supply-chain squat question separately (lock the scope or pre-register placeholders).

Found by: security, ops, devils-advocate, performance, architecture, agent-compat, language (7 reviewers).


Posted by code-review-squad · /full-review

export type Connection = NativeConnection;
export type Statement = NativeStatement;

export interface SeaNativeBinding {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — single-source-of-truth violation.

Lines 39-44 declare:

export interface SeaNativeBinding {
  version(): string;
  openSession(options: ConnectionOptions): Promise<NativeConnection>;
  Connection: typeof NativeConnection;
  Statement: typeof NativeStatement;
}

This is a manual subset of what native/sea/index.d.ts already declares. When the kernel adds a new free function (e.g. cancelStatement, future async APIs), the .d.ts regenerates but this interface stays stale until someone notices. The unchecked cast at line 82 (require('../../native/sea') as SeaNativeBinding) hides the drift.

  • Suggested fix: derive the type from the generated module instead of redeclaring it:
import type * as SeaNative from '../../native/sea';
export type SeaNativeBinding = typeof SeaNative;

Then the cast at line 82 stays automatically correct, and the README's claim that "the .d.ts is the contract" is enforced by the type system.

Found by: architecture.


Posted by code-review-squad · /full-review

return `SEA native binding failed to load on platform ${platform} / Node ${process.version}: ${err.message}`;
}

let cached: SeaNativeBinding | null | undefined;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — turns every SeaBackend unit test into an integration test once SEA is wired in.

cached and cachedError are module-level let bindings (lines 65-66). getSeaNative() / tryGetSeaNative() are free functions that close over them. There's no constructor, no DI seam, no exported reset hook.

Consequence for PR 3/8: when SeaBackend.connect() stops throwing NOT_IMPLEMENTED and starts calling getSeaNative(), any unit test that touches SeaBackend will require a real .node to be loadable on the test machine — otherwise it skips. sinon.stub(SeaNativeLoader, 'getSeaNative') works once per file but the cached value persists across files in the same mocha run. Order-dependence is baked in.

The lz4 pattern this is "mirroring" gets away with module-state because the existing codebase wraps it (e.g., Lz4Util and adapters at the call sites accept the resolved module). Here SeaNativeLoader IS the wrapper, and the wrapper has no seam.

  • Suggested fix: expose the loader as a class with an injectable seam:
export class SeaNativeLoader {
  constructor(private readonly load: () => SeaNativeBinding = defaultRequire) {}
  get(): SeaNativeBinding { /* ... */ }
  tryGet(): SeaNativeBinding | undefined { /* ... */ }
}

Then SeaBackend takes loader: SeaNativeLoader via constructor (matches the existing IClientContext pattern in lib/DBSQLClient.ts). Keep getSeaNative() as a thin convenience over a process-global default instance.

Much cheaper to change now (0 callers) than after PRs 3–8 land.

Found by: architecture, devils-advocate, maintainability adjacent (3 reviewers).


Posted by code-review-squad · /full-review

ArrowBatch,
ArrowSchema,
} from '@sea-native';

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — name collisions will surface as IDE autocomplete ambiguity and silent field-shape bugs once SEA is wired in.

Lines 34-37 re-export:

export type { ConnectionOptions, ExecuteOptions, ArrowBatch, ArrowSchema };
export type Connection = NativeConnection;
export type Statement = NativeStatement;

Collisions with existing public types:

  • ConnectionOptions (SEA: hostName/httpPath/token) vs ConnectionOptions from lib/contracts/IDBSQLClient.ts (Thrift: host/path/token + AuthOptions union). The token field is shared → silent shape-mismatch bugs when an IDE auto-imports the wrong one.
  • ArrowBatch (SEA: {ipcBytes: Buffer}) vs ArrowBatch from lib/result/utils.ts (Thrift: {batches: Buffer[]; rowCount: number}).

With 0 callers today this is harmless. The moment SEA is wired into DBSQLClient and these types are imported anywhere outside lib/sea/, the bare names become ambiguous in autocomplete and in code review.

  • Suggested fix: rename the SEA-side exports now, before any production consumer imports them:
export type { ConnectionOptions as SeaConnectionOptions, ExecuteOptions as SeaExecuteOptions, ArrowBatch as SeaArrowBatch, ArrowSchema as SeaArrowSchema };
export type SeaConnection = NativeConnection;
export type SeaStatement = NativeStatement;

The kernel-generated .d.ts can keep the napi-rs default names (it's the kernel's contract). The TS-side wrapper is where disambiguation should happen, and doing it before any consumer imports the bare names is a one-way door.

Found by: agent-compat, architecture, devils-advocate, language, maintainability (5 reviewers).


Posted by code-review-squad · /full-review


describe('SEA native binding — smoke test', function smoke() {
const binding = tryGetSeaNative();
if (binding === undefined) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — coverage gap that masks broken-binding regressions.

Two coupled problems:

  1. tests/unit/sea/version.test.ts:20-30 — silent skip when tryGetSeaNative() returns undefined. No CI escalation. On a linux-x64-gnu CI runner where the optional dep should install, a silently broken @databricks/sea-native-linux-x64-gnu would produce a green build. Compare to tests/e2e/sea/e2e-smoke.test.ts:55-62, which DOES fail loud in CI when env vars are missing — but applies the same logic only to env vars, not to the binding.

  2. Line 33 — the only assertion is expect(binding.version()).to.match(/^\d+\.\d+\.\d+$/). No shape-check on binding.openSession, binding.Connection, binding.Statement. If the kernel renames any of these, the test still passes — green.

  • Suggested fixes:
    • Mirror the CI fail-loud logic from the e2e test: on a runner whose triple is supposed to have a published .node (e.g., process.env.CI === 'true' and process.platform === 'linux' && process.arch === 'x64'), treat binding === undefined as a hard failure.
    • Add shape checks inside tryLoad() (lib/sea/SeaNativeLoader.ts:82 — verify typeof binding.openSession === 'function', binding.Connection exists, binding.Statement exists) so kernel-side shape drift is caught.
    • Add a tests/unit/sea/loader.test.ts with proxyquire covering the loadFailureHint branches (MODULE_NOT_FOUND, ERR_DLOPEN_FAILED, generic), the Node-version gate, and caching. Pure logic tests; run everywhere regardless of binding availability.

Found by: test, devils-advocate, ops, agent-compat.


Posted by code-review-squad · /full-review


function tryLoad(): SeaNativeBinding | undefined {
const nodeMajor = detectNodeMajor();
if (Number.isFinite(nodeMajor) && nodeMajor < MIN_NODE_MAJOR) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — defensive guard works backwards.

if (Number.isFinite(nodeMajor) && nodeMajor < MIN_NODE_MAJOR) evaluates to false when nodeMajor === NaN, so the load proceeds. The intent reads "fail closed if we can't tell what Node we're on" but the code does the opposite.

In practice process.version is always parseable, so the practical risk is near-zero. But it's the kind of guard that future runtime quirks will surface.

Suggested fix:

if (!Number.isFinite(nodeMajor) || nodeMajor < MIN_NODE_MAJOR) {
  // ...
}

Posted by code-review-squad · /full-review

Comment thread native/sea/index.d.ts
executeStatement(sql: string, options: ExecuteOptions): Promise<Statement>
/**
* Explicit close. Marks the connection wrapper as closed so
* subsequent calls on this `Connection` return `InvalidArg`, then
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — agent-friendliness / signal-to-noise.

The docstring on Connection.close discusses tracing::EnteredSpan, !Send futures, napi-rs's execute_tokio_future glue rejects non-Send futures, etc. This is excellent for a Rust reviewer but counterproductive for JS consumers — an LLM agent answering "why doesn't connection.close() await?" will paste the Rust paragraph into a JS issue/Slack where it has no context.

When the kernel fixes this (via cloned span or similar), the .d.ts is regenerated and the now-stale Rust rationale silently disappears — any agent that scraped it for caching is now serving wrong advice.

  • Suggested fix: in the Rust source, move the kernel-internal reasoning into a Rust comment (// SAFETY: style) and keep the #[napi] JSDoc terse and behavior-focused:

Returns immediately after marking the wrapper closed. The server-side close runs in the background; failures are not surfaced to JS. Use Statement.close() (which awaits) if you need to observe close errors per-statement.

Same lens applies to the class-level docs on Connection and StatementArc<Mutex<Option<...>>>, result_stream_mut, ExecutedStatementHandle are all Rust-side concerns the JS audience doesn't need.


Posted by code-review-squad · /full-review

if (err.code === 'MODULE_NOT_FOUND') {
return `SEA native binding not installed for platform ${platform} on Node ${process.version}. ${installHint}`;
}
if (err.code === 'ERR_DLOPEN_FAILED') {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — debuggability.

When dlopen fails (binding installed but won't load), real-world causes include musl-vs-glibc, Node ABI version mismatch, architecture mismatch (x64 binary on arm64 via Rosetta). The current hint says "likely a libc or Node ABI mismatch" without:

  • Including the underlying err.message (which carries the actual dlerror string, e.g. GLIBC_2.32 not found).
  • Suggesting concrete remediation (rm -rf node_modules, install musl variant, etc.).

Suggested fix:

return `SEA native binding present but failed to dlopen on platform ${platform} / Node ${process.version}: ${err.message}. Common causes: glibc/musl mismatch (e.g., Alpine Linux), Node ABI mismatch (try \`rm -rf node_modules && npm install\`), or architecture mismatch.`;

Found by: ops, devils-advocate.


Posted by code-review-squad · /full-review

Comment thread native/sea/index.d.ts
token: string
}
/**
* Open a Databricks SQL session over PAT auth and return an opaque
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low (forward-looking; not blocking M0).

ArrowBatch.ipcBytes is documented as "a complete Arrow IPC stream (schema header + 1 record-batch message)". But Statement.schema() already exists and returns the schema-only payload. Re-encoding the schema in every batch means:

  • Extra bytes serialized in Rust, copied across napi, held in V8 heap on every batch.
  • Consumers construct a fresh RecordBatchReader per batch (re-parses schema each time) — see e2e-smoke.test.ts:87.

For wide schemas (hundreds of columns / nested types) the schema bytes are non-trivial and pure overhead per batch.

  • Suggested for M1: have fetchNextBatch() return just the record-batch IPC message (no schema header), or expose a streaming reader handle that emits a single schema followed by record-batch messages. The schema-only schema() API is already in place to support this.

Found by: performance.


Posted by code-review-squad · /full-review

// (process.env.CI === 'true') missing secrets are fatal — a silent
// skip would let credential-rotation regressions reach prod.

const REQUIRED_ENV = [
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — credential-source proliferation.

This test uses DATABRICKS_PECOTESTING_SERVER_HOSTNAME / ..._HTTP_PATH / ..._TOKEN_PERSONAL. Every other e2e test uses E2E_HOST / E2E_PATH / E2E_ACCESS_TOKEN via tests/e2e/utils/config.ts.

Two separate credential sources in the same npm run e2e invocation means CI may have one set but not the other, leading to mismatched skip behavior, and dev machines need to source two different credential files.

  • Suggested fix: reuse tests/e2e/utils/config.ts (preferred — single source of truth) or document the divergence in native/sea/README.md.

Found by: test.


Posted by code-review-squad · /full-review

@vikrantpuppala
Copy link
Copy Markdown
Collaborator

Code Review Squad — Review

Score: 52/100 — HIGH RISK (driven by supply-chain readiness, not code correctness)

The loader's structural code (SeaNativeLoader.ts) faithfully mirrors the established lib/utils/lz4.ts pattern: lazy load, cached, structured errors, fail-tolerant for Thrift-only consumers. With 0 in-repo callers today, the PR can't crash anything in production yet — but the release-engineering envelope around it has several landmines that will bite the moment PR 3/8 wires SEA into DBSQLClient.

Headline issues: missing native/sea/index.js in the published tarball (gitignored + no prepack hook); the @sea-native path alias leaking into emitted .d.ts; the install-hint suggesting a package name that doesn't exist on the registry (actual name has -gnu suffix). Address H1, M1, M2 + the platform fan-out (M3) before this PR or the next one is wired into the public surface. The remainder are forward-looking design notes that are cheaper to fix now (0 callers) than after the public API is locked in.

Reviewers: 9/9 delivered (security, devils-advocate, language, test, architecture, maintainability, agent-compat, ops, performance).


Could not post inline comments for: M4, L2 — see body below.

Medium Findings

M4 — engines.node = '>=14.0.0' but loader requires Node 18+ — silent contract mismatch at install time

Medium — npm install gives no signal; users find out at runtime.

package.json:12 declares engines.node = ">=14.0.0". lib/sea/SeaNativeLoader.ts:46 defines const MIN_NODE_MAJOR = 18;. The two contracts are not aligned.

A consumer on Node 14 or 16 runs npm install @databricks/sql. npm doesn't warn (14 >= 14). The driver works fine for Thrift. The moment they try SEA — or upgrade to a future driver version where SEA is the default — they get a runtime error.

The defensible read is "SEA is opt-in; Thrift-only consumers are fine on older Node," but that contract isn't documented anywhere user-visible, and there's no install-time signal.

  • Suggested fix: one of:
    • (a) bump engines.node to >=18.0.0 if SEA is the new default backend (will break Thrift-only customers on older Node — probably not yet).
    • (b) extract MIN_NODE_MAJOR to a single source of truth (e.g., lib/version.ts already exists and is generated — could include MIN_SEA_NODE_MAJOR) and reference it from loader + README + CI matrix.
    • (c) at minimum, document the per-feature Node floor in the README.

Found by: ops, devils-advocate, architecture, maintainability (4 reviewers).

Low Findings

Low findings — click to expand

L2 — build:native script is fragile in several dimensions

Low — every developer who runs this on a non-author machine will hit one of these.

Several issues:

  1. bash -c excludes Windows contributors entirely. Use bin/build-native.js (Node script) for cross-platform.
  2. BUILD_PROFILE is misleading — it's substituted as the value of the --platform flag, NOT as a profile name. BUILD_PROFILE=debug produces napi build --platform debug (interprets debug as the target platform). The README documents only BUILD_PROFILE= (empty) for debug, but a contributor reading the var name will reach for debug.
  3. cp index.* glob is overbroad — captures any future stray files in the napi crate dir. Use explicit cp index.js index.d.ts index.*.node native/sea/.
  4. No set -euo pipefail — if cd fails (kernel checkout missing at the default path), bash continues with subsequent commands silently. Add a pre-check: [ -d "${DATABRICKS_SQL_KERNEL_REPO:-../../databricks-sql-kernel/napi-binding}/napi" ] || { echo "Set DATABRICKS_SQL_KERNEL_REPO to your kernel checkout — see native/sea/README.md"; exit 1; }.

Found by: language, maintainability, devils-advocate, ops (4 reviewers).

L7 — Buffer return on fetchNextBatch() forces a copy out of Rust on every batch (FFI hot path)

native/sea/index.d.ts:46-52fetchNextBatch() returns ArrowBatch { ipcBytes: Buffer }. napi-rs's Buffer is an owning Node-allocated buffer; every batch is memcpy'd from Rust into V8 heap.

For the result-streaming hot path this is the dominant per-batch cost beyond the schema duplication (L5). Zero-copy alternatives to consider before the surface is wired into the result layer: napi_create_external_arraybuffer with a finalizer that keeps Rust-side bytes::Bytes alive — Node observes the bytes in place, drop runs on GC.

Cheap to change now (0 callers); breaking after. Forward-looking, not blocking M0.

Found by: performance.

L8 — Per-batch promise round-trip; no async iterator / readable stream

native/sea/index.d.ts:122-128fetchNextBatch(): Promise<ArrowBatch | null> means N batches → N independent promise resolutions across the napi tokio→libuv→V8 boundary. Standard pattern, but the lower-overhead alternative is an async iterator or push-style channel (ThreadsafeFunction calling a JS callback per batch) so the producer doesn't stall waiting for JS to request the next batch.

For pipelined fetches against a fast warehouse the per-batch event-loop ping-pong dominates once the per-batch payload is small. Not blocking M0; revisit in the M1 streaming work flagged in the docstrings.

Found by: performance.

L9 — No batch-size / max-rows hint on executeStatement ExecuteOptions

native/sea/index.d.ts:6-22ExecuteOptions = { initialCatalog?, initialSchema?, sessionConfig? } doesn't surface any batch-size knob. Consumers can't trade memory for throughput. Server-side default may be fine for M0 but worth tracking — once the surface is exported it's harder to extend without a breaking option-shape change. sessionConfig is a partial escape hatch but not discoverable.

Found by: performance.

L10 — No logging/warning when binding load fails; failure is silent until first getSeaNative() call

lib/sea/SeaNativeLoader.ts:65-90lib/utils/lz4.ts at least console.warns on unknown errors. SeaNativeLoader.ts does not — it caches the error and exposes it only via getSeaNative(). So a consumer who calls tryGetSeaNative() at startup and silently gets undefined has no log line indicating why. In production, this manifests as "SEA suddenly stopped working after a deploy, no logs" — extremely hard to diagnose.

Better: emit a telemetry event tagged {backend: 'sea' | 'thrift', binding_loaded: bool, load_error_code, platform, node_version} so fleet-wide rollout visibility exists without per-customer log capture.

Found by: ops, devils-advocate.

L11 — Tests use this.pending = true; redundantly with it.skip(...); duplicated binding-absent preamble across files

tests/unit/sea/version.test.ts:23,27this.pending = true followed by it.skip(...). The it.skip() call already marks the suite as skipped; the this.pending = true line has an eslint-disable no-invalid-this because the function isn't an arrow. tests/e2e/sea/e2e-smoke.test.ts uses just it.skip — pick one style.

As tests/sea/ grows past these two files, factor the "binding absent → skip" preamble (unit:21-30 and e2e:48-53) into a tests/helpers/sea.ts (skipIfSeaUnavailable(suite)).

Found by: test, maintainability, ops, devils-advocate (4 reviewers).

L12 — cachedError lives on as module state after cached=null; pair into a discriminated union

lib/sea/SeaNativeLoader.ts:65-66cached and cachedError are separate let bindings with a 3-state implicit cache (undefined / null / value). Today's pairing is correct, but if a future refactor ever resets cached to undefined without also clearing cachedError, a subsequent success would carry a stale error around.

  • Suggested fix: discriminated union — let state: { ok: true; binding } | { ok: false; err: Error } | undefined.

Found by: language, maintainability.

L13 — linguist-generated=true for native/sea/index.d.ts hides PR diffs

.gitattributes:36 marks native/sea/index.d.ts as linguist-generated=true. This is the canonical "auto-generated" flag, but its side effect is that GitHub collapses diffs of this file by default. Since this file IS the public TS contract for the napi-rs binding, reviewers will NOT see changes to it on subsequent PRs without explicitly expanding the diff.

Consider linguist-detectable=false instead (excludes from language stats only, keeps diff visible).

Found by: devils-advocate.

L14 — No musl vs glibc detection; Alpine consumers will hit ERR_DLOPEN_FAILED with no Alpine-specific hint

Per M3, only linux-x64-gnu is in optionalDependencies. The router would (assuming napi-rs's per-triple package even installs on Alpine via os/cpu gating — depends on the per-triple package's metadata) try to load against ld-linux-x86-64.so.2, which Alpine doesn't have. Result: ERR_DLOPEN_FAILED with a generic "libc or Node ABI mismatch" hint.

  • Suggested fix: either ship a linux-x64-musl build, or detect musl at runtime (read /etc/os-release or check process.report.getReport().header.glibcVersionRuntime) and surface a precise message: "Alpine Linux detected — no musl build is available for this driver. Continue using the Thrift backend on this runtime."

Found by: devils-advocate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants