feat(gfql/ir): LogicalPlan structural verifier (M2-PR3) by lmeyerov · Pull Request #1131 · graphistry/pygraphistry

lmeyerov · 2026-04-11T20:36:31Z

Closes #1127

Summary

Adds graphistry/compute/gfql/ir/verifier.py with verify(plan: LogicalPlan) -> list[CompilerError]
Tree-walks the operator DAG and checks five structural invariants:
1. op_id uniqueness — non-zero op_ids must be distinct across the whole tree (0 = unassigned, exempt)
2. Dangling references — input/left/right/subquery slots must hold LogicalPlan | None
3. Predicate scope — PatternMatch predicates must carry non-empty expression strings
4. Schema validity — RowSchema column values must be LogicalType instances
5. Optional-arm nullability — PatternMatch(optional=True) must carry a non-empty arm_id
Accumulates all errors (no short-circuit) so callers see the full violation set
Exports verify from ir/__init__.py
23 focused tests: positive (valid plans → zero errors) and negative (malformed fixtures → stable errors)

Test plan

pytest graphistry/tests/compute/gfql/test_ir_verifier.py — 23/23 pass
./bin/ruff.sh — clean
./bin/typecheck.sh graphistry/compute/gfql/ir/verifier.py — clean (mypy: no issues)
All existing IR tests still green (41 total)
CI green

Non-goals (per issue)

No cost-based optimization
No physical planning / executor routing

🤖 Generated with Claude Code

Adds `graphistry/compute/gfql/ir/verifier.py` with `verify(plan) -> list[CompilerError]` implementing five structural invariants via full tree walk: 1. op_id uniqueness (non-zero ids must be distinct) 2. Dangling reference detection (child slots must be LogicalPlan | None) 3. Predicate scope validation (PatternMatch predicates need non-empty expressions) 4. Output schema consistency (RowSchema columns must hold LogicalType instances) 5. Optional-arm nullability contract (optional PatternMatch must carry arm_id) Exports `verify` from `ir/__init__.py`. 23 focused tests (positive + negative). Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…pe, test coverage Fixes from post-implementation spec audit (Round A): - Fix _MISSING sentinel forward-ref: move above _children() where it's used - Extract _CHILD_SLOTS constant to eliminate duplicate tuple literal (DRY) - Extend predicate scope check (invariant 3) to Filter.predicate and IndexScan.predicate + residual_predicates, not just PatternMatch - Add _check_predicate() helper to deduplicate single-predicate error logic - Expand test suite: 23 -> 41 tests covering EdgeScan, Apply, SemiApply, AntiSemiApply, Unwind, IndexScan, dangling refs on left/right/subquery, Filter predicate scope, multi-predicate mixed-empty, multi-violation accumulation, PathType schema validation - Replace fragile object.__setattr__ test fixture with _inject() helper that uses dataclasses.replace() as the primary path Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…t element_type gap Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

… assertions Round C test amplification: - Add positive tests for all remaining operators: Project, Aggregate, Distinct, OrderBy, Limit, Skip, Join, GraphToRows, RowsToGraph, PathProjection - Tighten TestErrorAccumulation: assert specific error messages present, not just count >= N; pin test_three_violations_in_one_node to exactly 3 errors 43 -> 53 verifier tests; 71 total IR tests passing Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…enting test - _inject(): remove unreachable try/except TypeError — dataclasses.replace() never raises TypeError for wrong-typed values (only unknown field names); use object.__setattr__ directly to bypass frozen=True guard - Convert test_listtype_shallow_check_only to xfail(strict=True) so future element_type recursion implementation auto-surfaces as xpass Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…ompat list[str] generic syntax in function annotations requires 3.9+; the __future__ import defers evaluation so it works on 3.8 too. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

Tests never instantiate CompilerError directly; verify() return values are accessed via .message. Pyright flagged this as "not accessed". Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

lmeyerov · 2026-04-11T23:27:54Z

M2-PR3 audit update (spec-conformance review)

Main gaps to address before this verifier can be treated as M2-grade:

Predicate-scope invariant is not scope validation yet (high)

Current implementation checks only that predicate expression strings are non-empty.
It does not validate variable references against visible scope (input schema / pattern aliases).
Milestone M2 verifier contract expects dangling reference and predicate-in-scope checks.

Traversal has no cycle guard (medium)

_walk() is recursive without visited-set protection.
Malformed cyclic graphs can recurse indefinitely instead of returning deterministic diagnostics.

op_id=0 exemption can hide unassigned-id planner bugs (medium)

Current rule allows multiple op_id=0 nodes.
For planner-emitted plans in M2, we should fail when IDs are unassigned/non-positive (or add strict mode).

Known schema recursion gap retained (low)

ListType.element_type validation remains non-recursive and is intentionally xfailed.

I wrote the full local audit notes in:

/home/lmeyerov/Work/graphc/agents/plans/compiler-refactor/audit/pr1131-findings-2026-04-11.md

- _walk now tracks a path-ancestor set to detect true cycles (back-edges into the current DFS path), distinct from shared-subtree diamonds which are valid DAGs. Cycle edges are reported as CompilerErrors after traversal. - Added 3 cycle tests: self-loop, two-node cycle, diamond no-false-positive. - Expanded op_id=0 comment to explain the intentional planner-vs-fixture split. - Addresses findings #2 and #3 from pr1131-findings-2026-04-11.md. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

- Cycle error message: "already-visited" → "ancestor" (path set tracks ancestors, not all visited nodes; diamond nodes are visited but not ancestors) - Simplified test_two_node_cycle_caught: removed spurious _inject call, added clarifying comment about NodeScan + object.__setattr__ mechanism - test_dag_shared_child_no_false_positive: strengthened to assert verify()==[] (not just no cycle error), with comment explaining path-set pop correctness - Added test_cycle_error_message_mentions_ancestor: pins message content Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

…ntend gate) cypher-frontend-strict-typing runs mypy --strict on all ir/*.py files. Bare `set` and `list` in _walk()'s signature fail [type-arg] under --strict. Replace with typing.Set[int], typing.List[...], typing.Tuple[str, str]. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

Optional PatternMatch arms produce NULL rows when the pattern is absent. ScalarType columns in the output_schema must be nullable=True to accommodate this. Added check + 4 tests covering: non-nullable caught, nullable ok, NodeRef (no flag) ok, non-optional non-nullable ok. Addresses finding #2 from pr1131-findings-2026-04-12-round2.md. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

lmeyerov and others added 7 commits April 11, 2026 13:36

test(gfql/ir/verifier): amplify — add ListType schema tests + documen…

393e09a

…t element_type gap Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

fix(gfql/ir): add from __future__ import annotations for Python 3.8 c…

70c5000

…ompat list[str] generic syntax in function annotations requires 3.9+; the __future__ import defers evaluation so it works on 3.8 too. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

chore(gfql/ir): remove unused CompilerError import from verifier tests

22dcdf3

Tests never instantiate CompilerError directly; verify() return values are accessed via .message. Pyright flagged this as "not accessed". Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

lmeyerov and others added 4 commits April 11, 2026 16:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(gfql/ir): LogicalPlan structural verifier (M2-PR3)#1131

feat(gfql/ir): LogicalPlan structural verifier (M2-PR3)#1131
lmeyerov wants to merge 11 commits intomasterfrom
m2-pr3-logicalplan-verifier

lmeyerov commented Apr 11, 2026

Uh oh!

lmeyerov commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lmeyerov commented Apr 11, 2026

Summary

Test plan

Non-goals (per issue)

Uh oh!

lmeyerov commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant