feat(gfql/ir): LogicalPlan structural verifier (M2-PR3)#1131
Open
feat(gfql/ir): LogicalPlan structural verifier (M2-PR3)#1131
Conversation
Adds `graphistry/compute/gfql/ir/verifier.py` with `verify(plan) -> list[CompilerError]` implementing five structural invariants via full tree walk: 1. op_id uniqueness (non-zero ids must be distinct) 2. Dangling reference detection (child slots must be LogicalPlan | None) 3. Predicate scope validation (PatternMatch predicates need non-empty expressions) 4. Output schema consistency (RowSchema columns must hold LogicalType instances) 5. Optional-arm nullability contract (optional PatternMatch must carry arm_id) Exports `verify` from `ir/__init__.py`. 23 focused tests (positive + negative). Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…pe, test coverage Fixes from post-implementation spec audit (Round A): - Fix _MISSING sentinel forward-ref: move above _children() where it's used - Extract _CHILD_SLOTS constant to eliminate duplicate tuple literal (DRY) - Extend predicate scope check (invariant 3) to Filter.predicate and IndexScan.predicate + residual_predicates, not just PatternMatch - Add _check_predicate() helper to deduplicate single-predicate error logic - Expand test suite: 23 -> 41 tests covering EdgeScan, Apply, SemiApply, AntiSemiApply, Unwind, IndexScan, dangling refs on left/right/subquery, Filter predicate scope, multi-predicate mixed-empty, multi-violation accumulation, PathType schema validation - Replace fragile object.__setattr__ test fixture with _inject() helper that uses dataclasses.replace() as the primary path Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…t element_type gap Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
… assertions Round C test amplification: - Add positive tests for all remaining operators: Project, Aggregate, Distinct, OrderBy, Limit, Skip, Join, GraphToRows, RowsToGraph, PathProjection - Tighten TestErrorAccumulation: assert specific error messages present, not just count >= N; pin test_three_violations_in_one_node to exactly 3 errors 43 -> 53 verifier tests; 71 total IR tests passing Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…enting test - _inject(): remove unreachable try/except TypeError — dataclasses.replace() never raises TypeError for wrong-typed values (only unknown field names); use object.__setattr__ directly to bypass frozen=True guard - Convert test_listtype_shallow_check_only to xfail(strict=True) so future element_type recursion implementation auto-surfaces as xpass Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ompat list[str] generic syntax in function annotations requires 3.9+; the __future__ import defers evaluation so it works on 3.8 too. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Tests never instantiate CompilerError directly; verify() return values are accessed via .message. Pyright flagged this as "not accessed". Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Contributor
Author
|
M2-PR3 audit update (spec-conformance review) Main gaps to address before this verifier can be treated as M2-grade:
I wrote the full local audit notes in:
|
- _walk now tracks a path-ancestor set to detect true cycles (back-edges into the current DFS path), distinct from shared-subtree diamonds which are valid DAGs. Cycle edges are reported as CompilerErrors after traversal. - Added 3 cycle tests: self-loop, two-node cycle, diamond no-false-positive. - Expanded op_id=0 comment to explain the intentional planner-vs-fixture split. - Addresses findings #2 and #3 from pr1131-findings-2026-04-11.md. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
- Cycle error message: "already-visited" → "ancestor" (path set tracks ancestors, not all visited nodes; diamond nodes are visited but not ancestors) - Simplified test_two_node_cycle_caught: removed spurious _inject call, added clarifying comment about NodeScan + object.__setattr__ mechanism - test_dag_shared_child_no_false_positive: strengthened to assert verify()==[] (not just no cycle error), with comment explaining path-set pop correctness - Added test_cycle_error_message_mentions_ancestor: pins message content Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ntend gate) cypher-frontend-strict-typing runs mypy --strict on all ir/*.py files. Bare `set` and `list` in _walk()'s signature fail [type-arg] under --strict. Replace with typing.Set[int], typing.List[...], typing.Tuple[str, str]. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Optional PatternMatch arms produce NULL rows when the pattern is absent. ScalarType columns in the output_schema must be nullable=True to accommodate this. Added check + 4 tests covering: non-nullable caught, nullable ok, NodeRef (no flag) ok, non-optional non-nullable ok. Addresses finding #2 from pr1131-findings-2026-04-12-round2.md. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1127
Summary
graphistry/compute/gfql/ir/verifier.pywithverify(plan: LogicalPlan) -> list[CompilerError]op_ids must be distinct across the whole tree (0 = unassigned, exempt)input/left/right/subqueryslots must holdLogicalPlan | NonePatternMatchpredicates must carry non-emptyexpressionstringsRowSchemacolumn values must beLogicalTypeinstancesPatternMatch(optional=True)must carry a non-emptyarm_idverifyfromir/__init__.pyTest plan
pytest graphistry/tests/compute/gfql/test_ir_verifier.py— 23/23 pass./bin/ruff.sh— clean./bin/typecheck.sh graphistry/compute/gfql/ir/verifier.py— clean (mypy: no issues)Non-goals (per issue)
🤖 Generated with Claude Code