feat(memory): use event metadata for session/agent state identification #244

Open
jariy17 wants to merge 2 commits into main from feat/event-metadata-state-identification

Conversation

Contributor

@jariy17 jariy17 commented Jan 31, 2026

Summary

  • Replaces prefixed actorId approach (session_{session_id}, agent_{agent_id}) with event metadata for identifying session and agent state events
  • Adds automatic migration for backwards compatibility with existing sessions
  • Fixes pagination bug with metadata filters and handles eventual consistency

Fixes #220
Fixes #196

Changes

Event Metadata for State Identification

  • Added StateType enum (SESSION, AGENT) and metadata constants (STATE_TYPE_KEY, AGENT_ID_KEY)
  • Session and agent events now use metadata filters instead of prefixed actorIds
  • Cleaner separation of concerns - actorId represents the actual actor, not encoded state
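As a rough illustration of the shape of this change (not the code in this PR), the enum and metadata keys could look something like the sketch below. The key strings, the plain-string metadata values, and the `build_state_metadata` / `matches` helpers are assumptions made for the example; the real API may wrap metadata values differently.

```python
from enum import Enum

# Assumed values: the PR names these constants but does not show what they hold.
STATE_TYPE_KEY = "stateType"
AGENT_ID_KEY = "agentId"


class StateType(str, Enum):
    """Kind of state an event carries."""

    SESSION = "SESSION"
    AGENT = "AGENT"


def build_state_metadata(state_type, agent_id=None):
    """Return the metadata attached to a state event (hypothetical helper)."""
    metadata = {STATE_TYPE_KEY: state_type.value}
    if agent_id is not None:
        metadata[AGENT_ID_KEY] = agent_id
    return metadata


def matches(event_metadata, wanted):
    """True when every wanted key/value pair is present in the event metadata."""
    return all(event_metadata.get(key) == value for key, value in wanted.items())
```

With this shape, actorId stays the real actor and the lookup becomes a metadata match, for example `matches(event_metadata, {STATE_TYPE_KEY: "AGENT", AGENT_ID_KEY: agent_id})`.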

Backwards Compatibility

  • Auto-migration: when legacy events are detected, they are migrated to the new format
  • Creates new event with metadata, deletes old event with prefixed actorId
  • Existing sessions continue to work without any code changes
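The migrate-on-read flow can be sketched roughly as follows. This is a simplified model under stated assumptions: `InMemoryStore` is a hypothetical stand-in for the memory API, the `stateType` key string is assumed, and lookups are keyed only by actorId to keep the example short.

```python
import itertools

STATE_TYPE_KEY = "stateType"        # assumed key name
LEGACY_SESSION_PREFIX = "session_"  # legacy actorId prefix from the summary

_event_ids = itertools.count(1)


class InMemoryStore:
    """Hypothetical stand-in for the memory API; not the real client."""

    def __init__(self):
        self.events = {}

    def create_event(self, actor_id, payload, metadata=None):
        event_id = next(_event_ids)
        self.events[event_id] = {
            "actor_id": actor_id,
            "payload": payload,
            "metadata": metadata or {},
        }
        return event_id

    def delete_event(self, event_id):
        del self.events[event_id]

    def find(self, predicate):
        return [(eid, ev) for eid, ev in self.events.items() if predicate(ev)]


def read_session(store, actor_id, session_id):
    """Read session state, migrating a legacy event if one is found."""
    # Preferred path: the event is identified by metadata.
    current = store.find(
        lambda ev: ev["actor_id"] == actor_id
        and ev["metadata"].get(STATE_TYPE_KEY) == "SESSION"
    )
    if current:
        return current[0][1]["payload"]

    # Legacy path: the state was stored under a prefixed actorId.
    legacy_actor = f"{LEGACY_SESSION_PREFIX}{session_id}"
    legacy = store.find(lambda ev: ev["actor_id"] == legacy_actor)
    if not legacy:
        return None

    old_id, old_event = legacy[0]
    # Create the new event first, then delete the old one, so a failure
    # between the two steps never leaves the session without any state.
    store.create_event(actor_id, old_event["payload"], {STATE_TYPE_KEY: "SESSION"})
    store.delete_event(old_id)
    return old_event["payload"]
```

The first read of a legacy session pays for one extra create and one delete; later reads take the metadata path only.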

Bug Fixes

  • Fixed pagination in list_events where API returns nextToken even with 0 results, causing metadata filter mismatch errors
  • Added _retry_with_backoff to handle eventual consistency when reading newly created agents
  • Track created agent IDs to handle updates during consistency window
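For the pagination fix, the idea is roughly the loop below: stop as soon as a page comes back empty, even if the response still carries a nextToken. This is a sketch, not the actual `list_events` code; `fetch_page` is a hypothetical callable assumed to return a dict with `events` and an optional `nextToken`.

```python
def list_all_events(fetch_page, max_results=100):
    """Collect events across pages using a hypothetical fetch_page(token)."""
    events = []
    token = None
    while len(events) < max_results:
        page = fetch_page(token)
        batch = page.get("events", [])
        events.extend(batch)
        token = page.get("nextToken")
        # Stop on an empty page even when a token came back; following that
        # token is what led to the metadata filter mismatch error.
        if not batch or not token:
            break
    return events[:max_results]
```

Treating an empty page as the end of the listing assumes the API never returns an empty page in the middle of a result set.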

Test plan

  • Unit tests pass (55 passing, 1 pre-existing failure unrelated to this PR)
  • End-to-end demo with real API completed successfully
  • Legacy migration tested with existing sessions

Future Work

The following improvements are planned for subsequent PRs:

Replace actorId prefix-based approach with event metadata for distinguishing
session and agent state events. Add auto-migration for legacy events.

Changes:
- Add StateType enum (SESSION, AGENT) and metadata keys
- Update create_session/create_agent to include stateType metadata
- Update read_session/read_agent to filter by metadata
- Add backwards-compatible auto-migration: legacy events are converted
  to new format on read (create new with metadata, delete old)
- Add tests for legacy migration behavior
… filters

- Fix pagination bug in list_events where API returns nextToken even with
  0 results, causing "metadata filter mismatch" error on subsequent page
- Add _retry_with_backoff method for handling eventual consistency when
  reading newly created agents via metadata filter
- Track created agent IDs to handle updates during consistency window
- Update test to account for retry behavior in legacy migration
@jariy17 jariy17 requested a review from a team January 31, 2026 18:54
},
)
# Track created agent for eventual consistency handling
self._created_agent_ids.add(session_agent.agent_id)
Contributor

Does this mean we support multiple agents per session? Does Strands support that for session managers? Also, this means we're using metadata to store and retrieve events for one agent_ID, not branches.

Contributor Author

@jariy17 jariy17 Feb 4, 2026

_created_agent_ids is a fallback for when list_events_by_filter fails to find an agent due to eventual consistency. Newly created events aren't immediately visible when filtering by metadata; there's a short propagation delay. If the metadata query returns nothing but the agent_id is in _created_agent_ids, we know the agent was created and proceed accordingly rather than failing in update_agent.

]

# Use retry with backoff to handle eventual consistency
events = self._retry_with_backoff(
Contributor

Should we do retries for functions like read_agent? They add unnecessary cost when the agent is simply new and has no events yet.

Contributor Author

@jariy17 jariy17 Feb 4, 2026

Retries are needed because newly created events aren't immediately visible when filtering by event_metadata. Without retries, we may miss agent state updates.
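To make the cost trade-off concrete, here is a minimal sketch of a bounded retry with exponential backoff. It is not the `_retry_with_backoff` method from this PR; the function name, parameters, and default values are assumptions for illustration only.

```python
import time


def retry_with_backoff(fetch, max_attempts=3, initial_delay=0.1, factor=2.0,
                       sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out."""
    delay = initial_delay
    result = fetch()
    for _ in range(max_attempts - 1):
        if result:
            break
        # Wait before asking again; the delay grows after every empty read.
        sleep(delay)
        delay *= factor
        result = fetch()
    return result
```

With these assumed defaults, a genuinely empty agent costs at most three reads and about 0.3 seconds of waiting, while an agent that becomes visible on the first read pays nothing extra.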
