@smoreinis smoreinis commented Dec 25, 2025

Summary

Optimize all PostgreSQL create operations by using PostgreSQL's RETURNING clause to get auto-generated fields in a single query instead of two.

Changes

  1. Event Repository - Use INSERT RETURNING to get the auto-generated sequence_id in a single query
  2. Base PostgresCRUDRepository - Apply the same optimization to all PostgreSQL repositories (agents, spans, tasks, etc.)

Before (2 queries per create)

session.add(orm)
await session.commit()
await session.refresh(orm)  # Extra SELECT to get auto-generated fields

After (1 query per create)

# Return explicit columns to avoid lazy-loading relationships
stmt = insert(self.orm).values(**values).returning(*self.orm.__table__.columns)
result = await session.execute(stmt)
row = result.one()
await session.commit()
return self.entity.model_validate(dict(row._mapping))

Technical Details

  • Uses returning(*columns) instead of returning(ORM) to avoid lazy-loading relationships on returned ORM objects
  • Excludes None values from INSERT to preserve server defaults (e.g., created_at)
  • Works correctly with all ORM models including those with relationships (AgentORM, TaskORM)
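
The two points about RETURNING and None exclusion can be sketched without a PostgreSQL server. This is a minimal illustration, not the repository's code: it swaps in the standard-library sqlite3 module (SQLite 3.35 or newer is needed for RETURNING) in place of SQLAlchemy and PostgreSQL, and the events table, its task_id and content columns, and the create_event helper are all hypothetical names chosen for the example.

```python
import sqlite3

# Stand-in schema (hypothetical): sqlite3 replaces PostgreSQL so the sketch
# runs without a server. sequence_id and created_at are database-generated.
SCHEMA = """
CREATE TABLE events (
    sequence_id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id     TEXT NOT NULL,
    content     TEXT,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def create_event(conn: sqlite3.Connection, fields: dict) -> dict:
    """Insert one row and return it, generated columns included, in one statement."""
    # Drop None values so the database applies its own defaults instead of
    # receiving an explicit NULL (which would violate NOT NULL on created_at).
    values = {k: v for k, v in fields.items() if v is not None}
    # Column names are interpolated, so the keys must come from trusted code.
    columns = ", ".join(values)
    placeholders = ", ".join(f":{k}" for k in values)
    sql = f"INSERT INTO events ({columns}) VALUES ({placeholders}) RETURNING *"
    # Fetch every returned row before committing so the statement is finished.
    rows = conn.execute(sql, values).fetchall()
    conn.commit()
    return dict(rows[0])


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute(SCHEMA)

event = create_event(conn, {"task_id": "t-1", "content": "hello", "created_at": None})
print(event["sequence_id"], event["created_at"])
```

Because created_at is passed as None and filtered out, the column default fills it in, and the generated sequence_id comes back from the same statement with no follow-up SELECT.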

Benchmark Results (Events)

Tested with 200 iterations over a docker-to-docker network:

| Metric | OLD (add+refresh) | NEW (RETURNING) | Improvement |
| --- | --- | --- | --- |
| Mean latency | 0.812ms | 0.684ms | 15.8% faster |
| Median | 0.780ms | 0.644ms | 17.4% faster |
| P95 | 1.050ms | 0.887ms | 15.5% faster |
| P99 | 1.780ms | 1.009ms | 43.3% faster |
| Throughput | 1,231 events/sec | 1,462 events/sec | +231 events/sec |
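
As a sanity check, the Improvement column can be recomputed from the reported latencies. The sketch below assumes "faster" means the relative reduction in latency, and that throughput is roughly 1000 divided by the mean latency in milliseconds for sequential requests; neither definition is stated in the PR, and the helper names are made up for the example.

```python
# Latencies in milliseconds, copied from the benchmark results above.
old = {"mean": 0.812, "median": 0.780, "p95": 1.050, "p99": 1.780}
new = {"mean": 0.684, "median": 0.644, "p95": 0.887, "p99": 1.009}


def pct_faster(before_ms: float, after_ms: float) -> float:
    """Relative latency reduction, in percent, rounded to one decimal."""
    return round((before_ms - after_ms) / before_ms * 100, 1)


def throughput_per_sec(mean_ms: float) -> float:
    """Assumed model: one request at a time, so throughput = 1000 / mean latency."""
    return 1000 / mean_ms


improvement = {metric: pct_faster(old[metric], new[metric]) for metric in old}
print(improvement)
print(throughput_per_sec(old["mean"]), throughput_per_sec(new["mean"]))
```

Under these assumptions the percentages match the reported figures, and the throughput values agree with the reported ones to within one event per second, which suggests the reported numbers were derived from unrounded timings.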

Note: Docker-to-docker latency is ~0.1ms. In production, with typical network latency of 1-5ms, the absolute saving per create should be larger, since this change eliminates an entire network round-trip per create operation.
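
A rough back-of-the-envelope estimate of that effect is sketched below. It is an extrapolation, not a measurement: it assumes the ~0.1ms figure is a full round-trip time, that the measured mean saving splits into one round-trip plus a fixed server-side cost for the extra SELECT, and that only the round-trip part grows with network latency.

```python
# Figures from the benchmark above (milliseconds).
OLD_MEAN_MS = 0.812
NEW_MEAN_MS = 0.684
DOCKER_RTT_MS = 0.1  # assumed to be a full round-trip time

# Assumption: saving = one network round-trip + server-side cost of the
# extra SELECT, and the server-side part does not depend on the network.
measured_saving_ms = OLD_MEAN_MS - NEW_MEAN_MS
select_cost_ms = measured_saving_ms - DOCKER_RTT_MS


def estimated_saving_ms(rtt_ms: float) -> float:
    """Estimated time saved per create at a given round-trip latency."""
    return rtt_ms + select_cost_ms


for rtt in (0.1, 1.0, 5.0):
    print(f"rtt={rtt}ms -> ~{estimated_saving_ms(rtt):.2f}ms saved per create")
```

Under this model the saving per create is dominated by the round-trip time once latency reaches the 1-5ms range, which is the reasoning behind the note above.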

Test plan

  • All 15 repository tests pass
  • All 184 unit tests pass
  • Verified auto-generated fields (sequence_id, created_at) correctly populated
  • Tested with ORM models that have relationships (AgentORM with tasks relationship)
  • Benchmarked locally with 200 iterations

Replace session.add() + session.refresh() with insert().returning()
to get the auto-generated sequence_id in a single database round-trip.

Before: 2 queries per event (INSERT + SELECT)
After: 1 query per event (INSERT ... RETURNING *)

Expected ~33% reduction in database round-trips for event writes when the
COMMIT is counted (3 down to 2); the query count itself halves (2 down to 1).
@smoreinis smoreinis requested a review from a team as a code owner December 25, 2025 00:10
Extends the single-query INSERT RETURNING pattern to all PostgreSQL
repositories by optimizing the base class create() method.

Key changes:
- Use insert().values().returning() instead of add() + refresh()
- Return explicit columns to avoid lazy-loading relationships
- Exclude None values to preserve server defaults (e.g., created_at)

This reduces database round-trips from 2 to 1 for all PostgreSQL
create operations across agents, spans, tasks, and other entities.
@smoreinis smoreinis changed the title from "perf: Use INSERT RETURNING for event creation" to "perf: Use INSERT RETURNING for all PostgreSQL create operations" Dec 29, 2025
@smoreinis smoreinis merged commit c575755 into main Dec 29, 2025
6 checks passed
@smoreinis smoreinis deleted the perf/event-insert-returning branch December 29, 2025 20:57
3 participants