Problem
utils/jsonl_parser.py (~761 lines) combines three distinct responsibilities:
- JSONL line-level parsing: parse_session() read loop, message processors, metadata assembly
- Tool-result dispatch: 15-predicate ordered table _TOOL_RESULT_DISPATCH (lines ~570–586) where first match wins; ordering is load-bearing and documented only with inline comments
- Session peek optimization: quick_session_info() for fast title/timestamp without full parse
This coupling makes dispatch ordering harder to reason about, increases merge-conflict risk, and raises the cost of adding new tool-result shapes. Tuesday's real-session fixtures and dispatch-order regression tests exist specifically to guard this extraction.
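To make "first match wins; ordering is load-bearing" concrete, here is a minimal sketch of an ordered predicate table. The predicates, builders, and payload keys (filePath, newString) are hypothetical stand-ins, not the real entries of _TOOL_RESULT_DISPATCH; the point is only that swapping two rows silently changes the result for a payload that both predicates accept.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Payload = Dict[str, Any]
Predicate = Callable[[Payload], bool]
Builder = Callable[[Payload], Payload]


# Hypothetical predicates: the edit shape is a strict superset of the read shape.
def _is_file_edit(result: Payload) -> bool:
    return "filePath" in result and "newString" in result


def _is_file_read(result: Payload) -> bool:
    return "filePath" in result


def _build_edit(result: Payload) -> Payload:
    return {"kind": "edit", "path": result["filePath"]}


def _build_read(result: Payload) -> Payload:
    return {"kind": "read", "path": result["filePath"]}


# Ordered table: the narrower predicate has to precede the broader one.
DISPATCH: List[Tuple[Predicate, Builder]] = [
    (_is_file_edit, _build_edit),
    (_is_file_read, _build_read),  # would also match edit payloads
]


def dispatch(result: Payload,
             table: List[Tuple[Predicate, Builder]] = DISPATCH) -> Optional[Payload]:
    """Return the output of the first builder whose predicate matches, else None."""
    for predicate, build in table:
        if predicate(result):
            return build(result)
    return None


edit_payload = {"filePath": "a.py", "newString": "x = 1"}
correct = dispatch(edit_payload)                            # classified as an edit
swapped = dispatch(edit_payload, list(reversed(DISPATCH)))  # misclassified as a read
```

Neither ordering raises an error, which is why the dispatch-order regression tests have to pass unchanged on both sides of the move.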
Goal
Split the monolith into at least three focused modules with clear single responsibilities, without changing public imports or runtime behavior.
Prerequisites
- Monday PR2 merged: runtime validation at JSONL boundary
- Tuesday fixtures merged: tests/test_real_session_fixtures.py dispatch-order tests must pass before and after split
- Wednesday export work optional (imports parse_session only)
git checkout master
git pull
git checkout -b refactor/jsonl-parser-split
Scope
New modules (minimum)
| Module | Responsibility |
| --- | --- |
| utils/tool_dispatch.py | All _tool_result_pred_*, _tool_result_build_*, _TOOL_RESULT_DISPATCH, _parse_tool_result(); module header documents ordering contract |
| utils/session_peek.py | quick_session_info(): two-pass metadata peek |
| utils/jsonl_parser.py (slimmed) | parse_session(), message processors (_process_user/assistant/system/progress), metadata assembly, validation return |
Optional:
- utils/jsonl_helpers.py: shared content helpers (_normalize_content, _extract_text, _strip_system_tags, etc.) if needed to avoid cyclic imports between parser and peek modules
- utils/jsonl_reader.py: raw line I/O, only if it clarifies boundaries (not required if parse_session stays readable)
Backward compatibility (required)
Re-export from utils/jsonl_parser.py so that existing imports remain unchanged:
from utils.jsonl_parser import parse_session # api/, scripts/
from utils.jsonl_parser import quick_session_info # api/projects.py
from utils.jsonl_parser import _parse_tool_result # tests/
from utils.jsonl_parser import _TOOL_RESULT_DISPATCH # Tuesday fixture tests
from utils.jsonl_parser import _strip_system_tags # utils/md_exporter.py
No test file edits — if tests pass unmodified, re-exports are correct.
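One way to sanity-check the re-exports beyond "tests still import" is to confirm that each name on the facade is the same object as in its new home module, not a copy. The sketch below is self-contained: it builds in-memory stand-ins with types.ModuleType instead of importing the real utils package, and broken_reexports is a hypothetical helper, not something that exists in the repo.

```python
import types
from typing import Dict, List

# In-memory stand-ins for the new modules, with placeholder contents.
tool_dispatch = types.ModuleType("tool_dispatch")
tool_dispatch._TOOL_RESULT_DISPATCH = []              # placeholder table
tool_dispatch._parse_tool_result = lambda raw: raw    # placeholder function

session_peek = types.ModuleType("session_peek")
session_peek.quick_session_info = lambda path: {"path": path}

# Stand-in for the slimmed utils/jsonl_parser.py. In the real file these
# bindings would come from import statements such as (assumed form):
#   from utils.tool_dispatch import _TOOL_RESULT_DISPATCH, _parse_tool_result
#   from utils.session_peek import quick_session_info
facade = types.ModuleType("jsonl_parser")
facade._TOOL_RESULT_DISPATCH = tool_dispatch._TOOL_RESULT_DISPATCH
facade._parse_tool_result = tool_dispatch._parse_tool_result
facade.quick_session_info = session_peek.quick_session_info

# Which module is expected to own each re-exported name.
HOMES: Dict[str, types.ModuleType] = {
    "_TOOL_RESULT_DISPATCH": tool_dispatch,
    "_parse_tool_result": tool_dispatch,
    "quick_session_info": session_peek,
}

_MISSING = object()


def broken_reexports(facade_module: types.ModuleType,
                     homes: Dict[str, types.ModuleType]) -> List[str]:
    """List names the facade lacks or binds to a different object than the home module."""
    return [
        name
        for name, home in homes.items()
        if getattr(facade_module, name, _MISSING) is not getattr(home, name)
    ]


problems = broken_reexports(facade, HOMES)
```

Identity matters most for the table: if the facade held a copy of the list, a test that inspects the table imported from utils.jsonl_parser would no longer be looking at the object that _parse_tool_result iterates.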
Constraints
- No behavior change: mechanical move only; dispatch predicate order unchanged
- No cyclic imports between new modules
- No public API signature changes on parse_session() or quick_session_info()
- Update docs/architecture.md dispatch location only if it references the old single-file layout (one-line touch acceptable)
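The "no cyclic imports" constraint can be checked mechanically rather than by eye. The sketch below parses module sources with the standard-library ast module and runs a depth-first search over the resulting import graph. The layout dictionary is an assumed example of how the split modules might import each other, not the actual file contents, and the scanner only resolves absolute "import x" and "from x import y" forms ("from utils import jsonl_parser" would not be resolved to the submodule).

```python
import ast
from typing import Dict, List, Optional, Set


def imported_modules(source: str) -> Set[str]:
    """Collect absolute module names imported anywhere in the source text."""
    found: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
    return found


def find_cycle(sources: Dict[str, str]) -> Optional[List[str]]:
    """Return one import cycle among the given modules as a path, or None."""
    # Keep only edges that point at modules inside the checked set.
    graph = {
        name: sorted(imported_modules(src).intersection(sources))
        for name, src in sources.items()
    }
    state: Dict[str, int] = {}  # 1 = on the current DFS path, 2 = fully explored
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = 1
        path.append(node)
        for dep in graph[node]:
            if state.get(dep) == 1:  # back edge: dep is already on the path
                return path[path.index(dep):] + [dep]
            if dep not in state:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = 2
        return None

    for name in graph:
        if name not in state:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# Assumed post-split layout: helpers at the bottom, parser facade at the top.
layout = {
    "utils.jsonl_helpers": "",
    "utils.tool_dispatch": "from utils.jsonl_helpers import _extract_text\n",
    "utils.session_peek": "from utils.jsonl_helpers import _strip_system_tags\n",
    "utils.jsonl_parser": (
        "from utils.tool_dispatch import _parse_tool_result\n"
        "from utils.session_peek import quick_session_info\n"
    ),
}
acyclic = find_cycle(layout)

# The failure mode the optional helpers module is meant to avoid:
# the peek module reaching back into the parser for a shared helper.
bad = dict(layout)
bad["utils.session_peek"] = "from utils.jsonl_parser import _strip_system_tags\n"
cyclic = find_cycle(bad)
```

In a real check the sources would be read from the files under utils/ instead of inline strings.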
Out of Scope
- New tool-result types or dispatch reordering
- Runtime validation changes (utils/validation.py)
- New fixtures or parser behavior changes
- Export engine changes (Wednesday)
- Broad documentation rewrite
Acceptance Criteria
- utils/jsonl_parser.py split into ≥3 modules with clear single responsibilities (tool_dispatch.py, session_peek.py, slimmed jsonl_parser.py)
- Existing imports from utils.jsonl_parser continue to work via re-exports
- Existing parser tests pass unmodified (tests/test_jsonl_parser.py)
- Dispatch-order regression tests pass unmodified (tests/test_real_session_fixtures.py)
- mypy --strict passes on utils (and dependent packages)
- pytest -q green in CI