fix: reward pipeline skips abandoned episodes (3 related fixes) by chiefmojo · Pull Request #1784 · MemTensor/MemOS

chiefmojo · 2026-05-22T12:27:01Z

Problem

The reward scoring pipeline processes only ~7 episodes on bootstrap and then goes permanently idle, leaving the vast majority of traces unscored. On a 43 MB database with 3,600 traces, only 45 (1.3%) had r_human scores before the fix.

Root Cause & Fixes

Three interacting bugs, all in apps/memos-local-plugin/:

Fix 1: `episodeRewardIsDirty()` excluded abandoned episodes

The dirty-check condition only matched closeReason === "finalized" or recoveryReason === "missed_session_end". However, 219 of 224 closed episodes had closeReason: "abandoned", so they never qualified for rescoring.

Fix: Add closeReason === "abandoned" to the rescore condition + a 10-minute periodic rescore timer.
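To make the change to the condition concrete, here is a minimal sketch of the predicate before and after Fix 1. The `EpisodeMeta` shape and both function names are hypothetical and simplified; the real `episodeRewardIsDirty()` is not shown in this description and likely checks more state.

```typescript
// Hypothetical, simplified episode shape (the real plugin type is not shown here).
interface EpisodeMeta {
  closeReason?: string;
  recoveryReason?: string;
}

// Condition as described before the fix: abandoned episodes never match.
function isDirtyBefore(ep: EpisodeMeta): boolean {
  return (
    ep.closeReason === "finalized" ||
    ep.recoveryReason === "missed_session_end"
  );
}

// Condition after Fix 1: abandoned episodes are eligible for rescoring too.
function isDirtyAfter(ep: EpisodeMeta): boolean {
  return isDirtyBefore(ep) || ep.closeReason === "abandoned";
}

const abandoned: EpisodeMeta = { closeReason: "abandoned" };
console.log(isDirtyBefore(abandoned), isDirtyAfter(abandoned)); // false true
```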

Fix 2: Daemon HTTP server must bind before `core.init()`

The daemon bridge called core.init() before startHttpServer(). When init rescores dirty episodes (5+ minutes of LLM calls), port 18800 stays unbound for that entire time. The Python watchdog's 15-second health probe times out, and the watchdog spawns a new daemon that kills the in-progress one.

Fix: In daemon mode, bind HTTP server first, then run init asynchronously.
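The ordering can be sketched as follows. This is an illustration only: `DaemonDeps`, `DaemonState`, and `startDaemonSketch` are hypothetical names, `bindHttp` stands in for `startHttpServer()`, and `init` stands in for `core.init()`. The real server start may itself be asynchronous; it is synchronous here to keep the sketch short.

```typescript
// Hypothetical dependencies, injected so the ordering is easy to see and test.
interface DaemonDeps {
  bindHttp: () => void;      // stand-in for startHttpServer(): binds the health port
  init: () => Promise<void>; // stand-in for core.init(): may take minutes
}

interface DaemonState {
  ready: boolean;      // true once init has finished
  initError?: unknown; // set if init rejected
}

function startDaemonSketch(deps: DaemonDeps): DaemonState {
  const state: DaemonState = { ready: false };

  // 1. Bind first, so a health probe can succeed within seconds.
  deps.bindHttp();

  // 2. Kick off the slow init without awaiting it.
  deps
    .init()
    .then(() => {
      state.ready = true;
    })
    .catch((err: unknown) => {
      state.initError = err;
    });

  return state;
}
```

With this shape, request handlers can consult `state.ready` to decide whether to serve a request or report that initialization is still in progress.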

Fix 3: `reward.skipped` blocked abandoned episodes from recovery

episodeRewardIsDirty() checked reward.skipped === true BEFORE closeReason === "abandoned". The reward runner correctly skipped 173 one-turn episodes in a previous session, but that flag permanently excluded them from recovery.

Fix: reward.skipped is only honored if the episode is NOT (abandoned + no prior recovery). Abandoned episodes get one pass; after that, closeReason is patched to "finalized" and the normal skip guard works.
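A minimal sketch of why this yields exactly one recovery pass. All names are hypothetical: `recoveryAttempted` is an assumed marker for "prior recovery", the `rHuman` check is an assumption about how scored episodes are recognized, and the `missed_session_end` branch is omitted for brevity.

```typescript
// Hypothetical, simplified episode shape for illustration only.
interface Episode {
  closeReason: string;
  recoveryAttempted?: boolean; // assumed marker for "a recovery pass already ran"
  reward: { skipped?: boolean; rHuman?: number };
}

function isDirtySketch(ep: Episode): boolean {
  const firstAbandonedPass =
    ep.closeReason === "abandoned" && ep.recoveryAttempted !== true;
  // Honor reward.skipped unless this is an abandoned episode's first pass.
  if (ep.reward.skipped === true && !firstAbandonedPass) return false;
  // Assumption: an episode that already has a score is clean.
  if (ep.reward.rHuman !== undefined) return false;
  return ep.closeReason === "finalized" || ep.closeReason === "abandoned";
}

// One recovery pass: score (or skip again), then patch closeReason.
function recoverSketch(
  ep: Episode,
  score: (ep: Episode) => number | null, // null means "skipped again"
): void {
  const r = score(ep);
  if (r === null) {
    ep.reward.skipped = true;
  } else {
    ep.reward = { rHuman: r };
  }
  ep.closeReason = "finalized"; // the normal skip guard applies from here on
  ep.recoveryAttempted = true;
}
```

If the scorer skips the episode again during recovery, the next dirty check sees `closeReason === "finalized"` together with `reward.skipped === true` and returns false, so the episode is not retried.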

Verification

- Daemon bridge: 7+ hours uptime, zero restart loops
- All 219 abandoned episodes processed
- r_human scores flowing for traces from May 9 onward

Closes #1782 (Reward pipeline skips abandoned episodes: 98% of closed episodes never scored)

- episodeRewardIsDirty() now includes closeReason=abandoned (219 of 224 closed episodes were silently skipped)
- Added 10-min setInterval for autoRescoreDirtyClosedEpisodes() so the daemon bridge doesn't go permanently idle after bootstrap
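A periodic rescore of this kind might look like the sketch below. The in-flight guard is an assumption added for illustration (a pass that takes longer than the interval would otherwise overlap with the next tick); the commit only states that a 10-minute `setInterval` was added. `makeGuardedTick` and `startPeriodicRescore` are hypothetical names, and `rescore` stands in for `autoRescoreDirtyClosedEpisodes()`.

```typescript
const RESCORE_INTERVAL_MS = 10 * 60 * 1000; // 10 minutes, per the commit message

// Wraps a rescore function so a new pass is skipped while one is still running.
// Returns true if a pass was started, false if the tick was skipped.
function makeGuardedTick(rescore: () => Promise<void>): () => boolean {
  let inFlight = false;
  return () => {
    if (inFlight) return false;
    inFlight = true;
    rescore()
      .catch((err: unknown) => {
        // Log and continue so one failed pass does not stop later ticks.
        console.error("periodic rescore failed", err);
      })
      .then(() => {
        inFlight = false;
      });
    return true;
  };
}

// Starts the timer and returns a function that stops it.
function startPeriodicRescore(rescore: () => Promise<void>): () => void {
  const timer = setInterval(makeGuardedTick(rescore), RESCORE_INTERVAL_MS);
  return () => clearInterval(timer);
}
```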

ensure_viewer_daemon() probes port 18800 with a 15-second timeout. When core.init() rescores dirty episodes it can take minutes, keeping the port unbound past the deadline. ensure_viewer_daemon() then gives up, releases the startup lock, and the next keepalive cycle spawns a replacement daemon that kills the in-progress one. The result is a restart loop that interrupts scoring mid-batch.

Fix: in daemon mode, bind the HTTP server first so the health probe succeeds within seconds, then run core.init() asynchronously in the background.

Non-daemon (stdio/JSON-RPC) mode is unchanged: it still runs init synchronously so host-LLM fallback is available during recovery.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ardIsDirty()

reward.skipped=true was being set on abandoned episodes when the reward runner decided the conversation was too short (< 2 turns). This flag then permanently excluded them from recovery rescoring: episodeRewardIsDirty() returned false before the closeReason==="abandoned" check was reached, leaving 173 episodes stuck unscored indefinitely.

Fix: only honor reward.skipped for episodes that are NOT (abandoned + no prior recovery attempt). recoverDirtyClosedEpisodes() patches closeReason → "finalized" after processing, so if reward still skips on recovery, the next dirty check sees closeReason !== "abandoned" and stops retrying. That gives exactly one recovery pass and no infinite loop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…lagged episodes

Episodes tagged lightweightMemory:true during a prior session when lightweight mode was active were permanently excluded from scoring even after the config was changed to enabled:false. Three guards enforced this unconditionally:

- the pre-filter in autoRescoreDirtyClosedEpisodes and init()
- the skip inside recoverDirtyClosedEpisodes
- the check in the capture subscriber

All three now condition on the *current* handle.algorithm.lightweightMemory.enabled value. The snapshot emitted for a legacy-flagged episode also has the lightweightMemory field stripped before the event fires, so the capture subscriber receives a clean snapshot. When lightweight mode is on, behavior is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
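As a rough sketch of the guard described in this commit, the check below honors the per-episode flag only while lightweight mode is currently enabled, and strips the legacy flag from the snapshot otherwise. The `EpisodeSnapshot` and `AlgorithmConfig` shapes and both function names are hypothetical; only the `lightweightMemory` flag and the `lightweightMemory.enabled` setting come from the commit message.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface EpisodeSnapshot {
  id: string;
  lightweightMemory?: boolean; // flag written while lightweight mode was active
}

interface AlgorithmConfig {
  lightweightMemory: { enabled: boolean };
}

// The episode flag only matters if lightweight mode is enabled right now.
function shouldSkipAsLightweight(
  ep: EpisodeSnapshot,
  algorithm: AlgorithmConfig,
): boolean {
  return algorithm.lightweightMemory.enabled && ep.lightweightMemory === true;
}

// Returns the snapshot to emit: legacy-flagged episodes lose the stale flag
// when lightweight mode is off, so downstream subscribers see a clean object.
function snapshotForCapture(
  ep: EpisodeSnapshot,
  algorithm: AlgorithmConfig,
): EpisodeSnapshot {
  if (algorithm.lightweightMemory.enabled || ep.lightweightMemory !== true) {
    return ep;
  }
  const copy: EpisodeSnapshot = { ...ep };
  delete copy.lightweightMemory;
  return copy;
}
```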

doranvas and others added 4 commits May 22, 2026 05:22

chiefmojo wants to merge 4 commits into MemTensor:main from chiefmojo:fix/abandoned-episode-scoring
