fix: persist session state across Bash tool calls for accurate telemetry#1050
Open
walton-chris wants to merge 4 commits intogarrytan:mainfrom
Open
fix: persist session state across Bash tool calls for accurate telemetry#1050walton-chris wants to merge 4 commits intogarrytan:mainfrom
walton-chris wants to merge 4 commits intogarrytan:mainfrom
Conversation
The preamble sets _TEL_START and _SESSION_ID as shell variables, but
shell state doesn't survive between separate Bash tool invocations.
When the final "## Telemetry (run last)" block runs in a new shell,
both vars are empty — arithmetic evaluates to 0 and the session id
falls back to "unknown". Result: every timeline entry logs
`duration_s:"0"` and `session:"unknown"`, regardless of real duration.
Fix: the preamble now writes a small state file to
`~/.gstack/analytics/.session-state` with `_TEL_START`, `_SESSION_ID`,
`_TEL`, and `_BRANCH`. The telemetry block sources it before computing
duration, uses `${_TEL_START:-$_TEL_END}` as a safe fallback if the
file is missing, and removes the state file after telemetry fires.
Scope: one edit to scripts/resolvers/preamble.ts. Regenerating via
`bun run gen:skill-docs` rolls the fix into every skill with
preamble-tier 3 (all telemetry-emitting skills). Minimal-preamble
skills (freeze, guard, careful, unfreeze, gstack-upgrade) have no
telemetry block and are unaffected.
Evidence this matters: an apex-coach office-hours run recorded
`{"duration_s":"0","session":"unknown","outcome":"success",...}`
despite lasting ~8 minutes. All the timeline entries I audited
locally have the same "0 / unknown" signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The three golden-file regression tests in host-config.test.ts compare committed ship SKILL.md files (claude, codex, factory hosts) against snapshots in test/fixtures/golden/. The previous commit changed the preamble, which bubbles into every host's ship output, so the golden snapshots were stale and CI failed. Regenerated via `bun run gen:skill-docs --host all` and copied: - ship/SKILL.md → claude-ship-SKILL.md - .agents/skills/gstack-ship/SKILL.md → codex-ship-SKILL.md - .factory/skills/gstack-ship/SKILL.md → factory-ship-SKILL.md Verified locally: `bun test test/host-config.test.ts` passes all three golden-file tests. `gen:skill-docs --dry-run --host all` reports FRESH for every generated file. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
CI's Skill Docs Freshness check compares committed SKILL.md files against what `gen:skill-docs --host all` produces. The earlier commit in this PR only regenerated files I identified as containing my preamble change, missing broader generator-driven drift that already existed on origin/main (voice-triggers folding into description, `sensitive:` flag corrections, etc.). Running `bun run gen:skill-docs --host all` and staging every modified generated file (31 SKILL.md + 4 openclaw skill variants) produces a tree where `--dry-run` reports 0 STALE. Functional scope is unchanged — the only intentional change remains the preamble session-state persistence in the previous commit. This commit is purely the regen delta. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The previous regen in this PR ran on Windows with core.autocrlf=true, which meant working-tree files had CRLF line endings. The generator's processVoiceTriggers() regex uses \n, so it silently no-ops on CRLF content — leaving voice-triggers as YAML lists instead of folding them into the description string. CI (Linux, LF) produces the folded form, so its output differed from what I committed. Fix: set core.autocrlf=input, reset to HEAD with LF, then regenerate. Dry-run now reports 0 STALE locally, matching CI's expectation. Note: this points at a cross-platform portability bug in the generator itself (the regex should accept \r\n). Worth a follow-up PR, but out of scope here. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The preamble sets
_TEL_STARTand_SESSION_IDas shell variables,but shell state does not survive between separate Bash tool
invocations (each Bash call is a fresh shell). When the final
## Telemetry (run last)block runs in a new shell, both variablesare empty — the arithmetic evaluates to 0 and the session id falls
back to
"unknown".Net effect: every timeline entry logs
duration_s:"0"andsession:"unknown", regardless of how long the skill actually ran.Session-correlation analytics are currently useless.
Evidence
Apex-coach office-hours session 2026-04-12:
~8 real minutes, logged as 0. Session ids don't correlate. Every
other
completedtimeline entry across the install has the samesignature.
Fix
One edit to
scripts/resolvers/preamble.ts:_TEL_STARTand_SESSION_ID, write a small statefile to
~/.gstack/analytics/.session-statevia heredoc..notsourcefor POSIX portability).\${_TEL_START:-\$_TEL_END}fallback somissing-file scenarios gracefully degrade to the current behavior
rather than erroring.
All SKILL.md files regenerated via
bun run gen:skill-docs --host all.Golden-file fixtures (claude/codex/factory ship) updated.
Scope: preamble-tier 3 skills pick up the fix via regeneration.
Minimal-preamble skills (freeze, guard, careful, unfreeze,
gstack-upgrade) don't include the telemetry block at all and are
unaffected.
Test plan
~/.gstack/analytics/.session-stateis written during preamble
duration_sin the skill's timeline.jsonl eventis a non-zero number matching real session duration
sessionis the PID-timestamp format set in preamble(not
"unknown")gracefully (duration 0, no crash)
Relationship to #1049
PR #1049 (office-hours artifact verification) is orthogonal but
complementary. Once both land, office-hours correctly tracks duration
AND correctly distinguishes successful sessions (artifact written)
from silent failures (no artifact). This PR is generic; that one is
office-hours-specific.
Commit history on this branch
4 commits. First is the substantive fix; the rest are regen/fixture
updates needed to land the first commit cleanly. Garry's usual
squash-merge pattern will collapse them on merge.
Generated with Claude Code