Skip to content

fix: persist session state across Bash tool calls for accurate telemetry#1050

Open
walton-chris wants to merge 4 commits intogarrytan:mainfrom
walton-chris:fix/preamble-shell-state-persist
Open

fix: persist session state across Bash tool calls for accurate telemetry#1050
walton-chris wants to merge 4 commits intogarrytan:mainfrom
walton-chris:fix/preamble-shell-state-persist

Conversation

@walton-chris
Copy link
Copy Markdown

Summary

The preamble sets _TEL_START and _SESSION_ID as shell variables,
but shell state does not survive between separate Bash tool
invocations (each Bash call is a fresh shell). When the final
## Telemetry (run last) block runs in a new shell, both variables
are empty — the arithmetic evaluates to 0 and the session id falls
back to "unknown".

Net effect: every timeline entry logs duration_s:"0" and
session:"unknown", regardless of how long the skill actually ran.
Session-correlation analytics are currently useless.

Evidence

Apex-coach office-hours session 2026-04-12:

{"skill":"office-hours","event":"started","branch":"master",
 "session":"10269-1775994715","ts":"2026-04-12T11:51:56.513Z"}
{"skill":"office-hours","event":"completed","branch":"master",
 "outcome":"success","duration_s":"0","session":"unknown",
 "ts":"2026-04-12T11:59:40.368Z"}

~8 real minutes, logged as 0. Session ids don't correlate. Every
other completed timeline entry across the install has the same
signature.

Fix

One edit to scripts/resolvers/preamble.ts:

  1. After setting _TEL_START and _SESSION_ID, write a small state
    file to ~/.gstack/analytics/.session-state via heredoc.
  2. At the top of the telemetry block, source the state file (using
    . not source for POSIX portability).
  3. Compute duration with \${_TEL_START:-\$_TEL_END} fallback so
    missing-file scenarios gracefully degrade to the current behavior
    rather than erroring.
  4. Remove the state file after telemetry fires.

All SKILL.md files regenerated via bun run gen:skill-docs --host all.
Golden-file fixtures (claude/codex/factory ship) updated.

Scope: preamble-tier 3 skills pick up the fix via regeneration.
Minimal-preamble skills (freeze, guard, careful, unfreeze,
gstack-upgrade) don't include the telemetry block at all and are
unaffected.

Test plan

  • Run any skill, confirm ~/.gstack/analytics/.session-state
    is written during preamble
  • Confirm duration_s in the skill's timeline.jsonl event
    is a non-zero number matching real session duration
  • Confirm session is the PID-timestamp format set in preamble
    (not "unknown")
  • Delete the state file mid-session, confirm telemetry degrades
    gracefully (duration 0, no crash)
  • Confirm state file is cleaned up after the skill completes

Relationship to #1049

PR #1049 (office-hours artifact verification) is orthogonal but
complementary. Once both land, office-hours correctly tracks duration
AND correctly distinguishes successful sessions (artifact written)
from silent failures (no artifact). This PR is generic; that one is
office-hours-specific.

Commit history on this branch

4 commits. First is the substantive fix; the rest are regen/fixture
updates needed to land the first commit cleanly. Garry's usual
squash-merge pattern will collapse them on merge.

Generated with Claude Code

walton-chris and others added 4 commits April 17, 2026 11:37
The preamble sets _TEL_START and _SESSION_ID as shell variables, but
shell state doesn't survive between separate Bash tool invocations.
When the final "## Telemetry (run last)" block runs in a new shell,
both vars are empty — arithmetic evaluates to 0 and the session id
falls back to "unknown". Result: every timeline entry logs
`duration_s:"0"` and `session:"unknown"`, regardless of real duration.

Fix: the preamble now writes a small state file to
`~/.gstack/analytics/.session-state` with `_TEL_START`, `_SESSION_ID`,
`_TEL`, and `_BRANCH`. The telemetry block sources it before computing
duration, uses `${_TEL_START:-$_TEL_END}` as a safe fallback if the
file is missing, and removes the state file after telemetry fires.

Scope: one edit to scripts/resolvers/preamble.ts. Regenerating via
`bun run gen:skill-docs` rolls the fix into every skill with
preamble-tier 3 (all telemetry-emitting skills). Minimal-preamble
skills (freeze, guard, careful, unfreeze, gstack-upgrade) have no
telemetry block and are unaffected.

Evidence this matters: an apex-coach office-hours run recorded
`{"duration_s":"0","session":"unknown","outcome":"success",...}`
despite lasting ~8 minutes. All the timeline entries I audited
locally have the same "0 / unknown" signature.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The three golden-file regression tests in host-config.test.ts compare
committed ship SKILL.md files (claude, codex, factory hosts) against
snapshots in test/fixtures/golden/. The previous commit changed the
preamble, which bubbles into every host's ship output, so the golden
snapshots were stale and CI failed.

Regenerated via `bun run gen:skill-docs --host all` and copied:
- ship/SKILL.md                        → claude-ship-SKILL.md
- .agents/skills/gstack-ship/SKILL.md  → codex-ship-SKILL.md
- .factory/skills/gstack-ship/SKILL.md → factory-ship-SKILL.md

Verified locally: `bun test test/host-config.test.ts` passes all
three golden-file tests. `gen:skill-docs --dry-run --host all`
reports FRESH for every generated file.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
CI's Skill Docs Freshness check compares committed SKILL.md files
against what `gen:skill-docs --host all` produces. The earlier commit
in this PR only regenerated files I identified as containing my
preamble change, missing broader generator-driven drift that already
existed on origin/main (voice-triggers folding into description,
`sensitive:` flag corrections, etc.).

Running `bun run gen:skill-docs --host all` and staging every
modified generated file (31 SKILL.md + 4 openclaw skill variants)
produces a tree where `--dry-run` reports 0 STALE.

Functional scope is unchanged — the only intentional change remains
the preamble session-state persistence in the previous commit. This
commit is purely the regen delta.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The previous regen in this PR ran on Windows with core.autocrlf=true,
which meant working-tree files had CRLF line endings. The generator's
processVoiceTriggers() regex uses \n, so it silently no-ops on CRLF
content — leaving voice-triggers as YAML lists instead of folding them
into the description string. CI (Linux, LF) produces the folded form,
so its output differed from what I committed.

Fix: set core.autocrlf=input, reset to HEAD with LF, then regenerate.
Dry-run now reports 0 STALE locally, matching CI's expectation.

Note: this points at a cross-platform portability bug in the generator
itself (the regex should accept \r\n). Worth a follow-up PR, but out
of scope here.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant