Skip to content

PostgreSQL dialect refactor - PR1 of 4 - Dialect#276

Open
webgress wants to merge 3 commits intowesm:mainfrom
webgress:pr1-branch-dialect-extraction
Open

PostgreSQL dialect refactor - PR1 of 4 - Dialect#276
webgress wants to merge 3 commits intowesm:mainfrom
webgress:pr1-branch-dialect-extraction

Conversation

@webgress
Copy link
Copy Markdown

Adding PostgreSQL as an opt-in database backend (SQLite stays the default).
Details in pg_refactor_docs/.
This is split into 4 PRs to make it easier to review:

  • PR1 — Dialect extraction (this PR): Introduces a Dialect interface and SQLiteDialect, moves all SQLite-specific SQL out of the store layer. No new dependencies, no behavior change.
  • PR2 — PostgreSQL dialect: Adds PostgreSQLDialect, pgx driver, PG schema with tsvector FTS, and dual-backend test support.
  • PR3 — PostgreSQL functional: Portable query layer (Rebind, RETURNING, parameterized FTS), PG-native schema migrations, connection pool config, and full test coverage across both backends.
  • PR4 — Data migration: migrate-db command for copying data between SQLite and PostgreSQL in either direction.

@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Apr 17, 2026

roborev: Combined Review (7f47f5c)

Verdict: the PR is not ready to merge; it introduces 1 high-severity regression risk and 4 medium-severity portability bugs.

High

  • internal/store/store.go:401 (InitSchema())
    Dialects that return "" from SchemaFTS() never mark FTS as available, so search indexing/backfill will silently stay disabled for those backends. This breaks the intended contract for dialects where FTS lives in the main schema.
    Fix: Set FTS capability independently of the SchemaFTS() branch, e.g. by probing via the dialect after schema init.

Medium

  • internal/store/api.go:179
    SearchMessages() combines dialect-provided FTS SQL with hardcoded ? placeholders for LIMIT/OFFSET, then executes the raw SQL without rebinding. That will fail on non-SQLite dialects.
    Fix: Build the full query with ? placeholders, then run s.Rebind() on the completed SQL before execution.

  • internal/store/api.go:249
    SearchMessagesQuery() always calls FTSSearchClause(1) even when earlier filters have already consumed parameters. For dialects using numbered placeholders, the FTS predicate will bind the wrong argument when prior conditions exist.
    Fix: Pass the next real parameter index, e.g. len(args)+1.

  • internal/store/messages.go (UpsertFTS())
    UpsertFTS() passes 7 arguments to s.db.Exec(), but Dialect.FTSUpsertSQL() is documented to take 6 parameters. That placeholder/argument mismatch is likely to break PostgreSQL integration.
    Fix: Pass exactly the 6 documented parameters, and adjust the SQLite SQL to reuse the first parameter via numbered placeholders if needed.

  • internal/store/store.go (OpenReadOnly())
    OpenReadOnly() creates the dialect but does not call dialect.InitConn(db). This is harmless for SQLite but will leave read-only connections for other dialects uninitialized.
    Fix: Call dialect.InitConn(db) immediately after dialect creation and return any error.


Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

webgress and others added 3 commits April 20, 2026 19:14
Introduce a Dialect interface that abstracts SQL generation and driver-
specific behavior, and wire the Store to delegate to it. SQLiteDialect
implements the existing SQLite behavior unchanged. No new dependencies.

Scope:
- Dialect interface (internal/store/dialect.go) covers Now(), Rebind(),
  InsertOrIgnore, UpdateOrIgnore, full-text search (upsert/search/delete/
  backfill/availability/clear), connection init, schema init, WAL
  checkpoint, migration probes, and driver error predicates.
- SQLiteDialect (internal/store/dialect_sqlite.go) returns the same SQL
  that was previously hardcoded.
- Store holds a Dialect field and routes calls through it; Open and
  OpenReadOnly both construct and InitConn the dialect.
- All datetime('now'), INSERT/UPDATE OR IGNORE/REPLACE, PRAGMA, FTS5,
  and isSQLiteError usages in the store package go through the dialect.
- subset.go is intentionally not migrated; it keeps using isSQLiteError
  directly and is flagged for follow-up.
- PostgreSQL URLs are rejected by Open() with a clear error.
- Design notes in pg_refactor_docs/ describe the PR1..PR4 plan.

Addressed review feedback:
- InitSchema probes FTSAvailable unconditionally so dialects without a
  separate FTS schema file (PostgreSQL) still set the availability flag.
- SearchMessages and SearchMessagesQuery Rebind the composed SQL before
  execution, and the FTS order fragment is taken from the dialect
  instead of hardcoding "rank".
- OpenReadOnly now calls dialect.InitConn.
- FTSSearchClause drops unused paramIndex; fragments use ? placeholders
  and callers Rebind the final query.
- FTSAvailable and FTSNeedsBackfill return bool; the always-nil error
  return was discarded by every caller.
- insertInChunks accepts a chunkInsert options struct instead of six
  positional parameters.
- FTSUpsertSQL docstring describes the actual SQLite 7-arg contract.
- Duplicate isSQLiteErrorMatch helper removed; dialect predicates share
  the canonical isSQLiteError helper that subset.go also uses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-owned FTS upsert, Rebind in insertInChunks

- FTSSearchClause() now returns (join, where, orderBy, orderArgCount).
  Callers bind the search term orderArgCount additional times so a PG
  ts_rank() placeholder in ORDER BY binds correctly after Rebind.
  SQLite returns 0 because "rank" is an implicit FTS5 column.
- mergeLabelByName rewritten in portable SQL (DELETE conflicting
  associations, then plain UPDATE). UpdateOrIgnore dropped from Dialect
  — it was a SQLite-semantic leak that PG cannot mechanically emulate.
- FTSUpsertSQL replaced by FTSUpsert(q, FTSDoc): the dialect now owns
  both the SQL and the argument shape, so SQLite's rowid duplication
  stays out of callers and PG is free to do a tsvector column update.
- chunkInsert gains a rebind field; all three callers
  (replaceMessageRecipientsTx, replaceMessageLabelsTx, AddMessageLabels)
  pass s.dialect.Rebind so composed chunk SQL is renumbered for PG.
  replaceMessageRecipientsTx takes RecipientSet to stay under the
  ≤5-positional-parameter limit.
- ensureLabelWith / mergeLabelByName no longer take a Dialect parameter
  now that the UPDATE-OR-IGNORE workaround is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…helpers

- SearchMessagesQuery uses a new ftsEnabled bool (len(TextTerms) > 0)
  as the authoritative signal that FTS is active. The old ftsJoin != ""
  check skipped the relevance ORDER BY and the non-FTS fallback for
  dialects whose FTS clause needs no join (PG tsvector).
- ensureLabelWith and mergeLabelByName now take a Dialect and route
  every ?-placeholder statement through d.Rebind, so non-SQLite
  backends get numbered placeholders. The Phase 1 loop inside
  EnsureLabelsBatch uses s.dialect.Rebind for the same reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wesm wesm force-pushed the pr1-branch-dialect-extraction branch from 7f47f5c to e291892 Compare April 21, 2026 00:46
@wesm
Copy link
Copy Markdown
Owner

wesm commented Apr 21, 2026

Just rebased and addressed some review findings

@roborev-ci
Copy link
Copy Markdown

roborev-ci bot commented Apr 21, 2026

roborev: Combined Review (e291892)

Verdict: Changes are not ready to merge; the dialect abstraction currently leaves multiple SQL execution paths incompatible with non-SQLite backends.

High

  • Placeholder rebinding is applied inconsistently across dialect-aware queries

    • Location: internal/store/messages.go:214, internal/store/messages.go:1157, internal/store/messages.go:1193; internal/store/messages.go (upsertMessageWith, EnsureConversation, UpsertAttachment, backfillFTSBatch); internal/store/sources.go; internal/store/sync.go
    • Problem: Several updated queries still execute with raw ? placeholders after introducing dialect-specific SQL fragments. On PostgreSQL or any non-SQLite backend, these paths will fail unless the statements are passed through Rebind().
    • Fix: Ensure every shared SQL statement is run through d.Rebind/s.dialect.Rebind before execution, or move full statement construction into dialect methods so callers never emit raw placeholders directly.
  • InsertOrIgnore() call sites also appear to skip placeholder rebinding

    • Location: internal/store/messages.go (EnsureParticipantsBatch, EnsureConversationParticipant, UpsertReaction, EnsureParticipantByPhone)
    • Problem: The abstraction rewrites the INSERT form, but the resulting SQL still contains ? placeholders and is executed without a corresponding Rebind(). That leaves these paths non-portable for backends that require positional parameters.
    • Fix: Apply Rebind() to the SQL returned from InsertOrIgnore() before executing it.

Medium

  • Chunked delete helpers may still generate non-portable IN (...) queries
    • Location: internal/store/messages.go (MarkMessagesDeletedBatch, MarkMessagesDeletedByGmailIDBatch)
    • Problem: execInChunks appears to build dynamic IN (%s) queries without a rebinding step, unlike the refactored insert chunking path. If so, these batch delete/update paths will still break on non-SQLite dialects.
    • Fix: Refactor execInChunks to accept and apply the dialect Rebind() function to the final composed query.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants