Skip to content

[codex-analytics] emit terminal tool review events#18748

Draft
rhan-oai wants to merge 1 commit intopr18747from
pr18748
Draft

[codex-analytics] emit terminal tool review events#18748
rhan-oai wants to merge 1 commit intopr18747from
pr18748

Conversation

@rhan-oai
Copy link
Copy Markdown
Collaborator

@rhan-oai rhan-oai commented Apr 20, 2026

Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c8f41eab9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +801 to +804
let status = match response.scope {
PermissionGrantScope::Turn => ToolReviewStatus::Approved,
PermissionGrantScope::Session => ToolReviewStatus::ApprovedForSession,
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Classify empty permission grants as denied reviews

PermissionsRequestApproval maps status from scope only, so Turn is always emitted as approved. But UI denial paths return an empty permissions profile with scope=Turn (see approval_overlay::handle_permissions_decision). That causes denied/timed-out permission reviews to be reported as approvals and denormalized into incorrect final_approval_outcome values on tool item events.

Useful? React with 👍 / 👎.

Comment on lines +1052 to +1057
let started = self.guardian_review_starts.remove(&notification.review_id);
let created_at = started
.as_ref()
.map(|start| start.created_at)
.unwrap_or(completed_at);
let Some(status) = guardian_review_status(notification.review.status) else {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Deduplicate guardian review completion notifications

This handler emits a review even when guardian_review_starts.remove(review_id) returns None. Because notifications are tracked per connection, multi-client threads ingest the same completion once per connection, creating duplicate codex_tool_call_review_event records and inflating per-item review counters.

Useful? React with 👍 / 👎.

@rhan-oai rhan-oai force-pushed the pr18748 branch 2 times, most recently from ec96f8b to 1fc452a Compare April 30, 2026 19:32
@rhan-oai rhan-oai force-pushed the pr18747 branch 2 times, most recently from 565e39a to 66f2072 Compare April 30, 2026 21:12
@rhan-oai rhan-oai force-pushed the pr18748 branch 2 times, most recently from 6617848 to 535ea75 Compare May 1, 2026 00:17
rhan-oai added a commit that referenced this pull request May 1, 2026
## Why

Several analytics event families need the same per-thread attribution
state: the app-server client/runtime associated with a thread and, for
lifecycle-oriented events, the thread metadata captured during
initialization. Keeping connection ids and lifecycle metadata in
separate maps made each consumer rebuild the same thread context and
made subagent attribution harder to resolve consistently.

## What changed

- Replaces the separate thread connection and metadata maps with one
reducer-owned `threads` map.
- Routes guardian, compaction, turn-steer, and turn analytics through
shared thread-state lookups while preserving turn-origin attribution for
turn events and request-origin attribution for steer events.
- Lets newly observed spawned subagent threads inherit their parent
thread connection so later thread-scoped analytics can resolve through
the same state model.
- Adds regression coverage for standalone `SubAgentThreadStarted`
publication plus the `SubAgentSource::ThreadSpawn` parent fallback
through a thread-scoped consumer that depends on inherited connection
state.

## Verification

- `cargo test -p codex-analytics`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300).
* #18748
* #18747
* #17090
* #17089
* #20239
* #20515
* #20514
* __->__ #20300
@rhan-oai rhan-oai force-pushed the pr18748 branch 2 times, most recently from a51cdee to e70e29a Compare May 3, 2026 03:38
manoelcalixto added a commit to Electivus/electivus-codex that referenced this pull request May 3, 2026
* [codex-analytics] ingest server requests and responses (openai#17088)

## Why

Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.

## What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.

## Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* openai#18748
* openai#18747
* openai#17090
* openai#17089
* openai#20241
* openai#20239
* __->__ openai#17088

* [tool_suggest] Improve tool_suggest triggering conditions. (openai#20091)

## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`

* app-server: fix outgoing sender test setup (openai#20258)

## Why

[openai#17088](openai#17088) changed
`OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but
one `command_exec` test added earlier on `main` still called the old
one-argument constructor. That leaves current `main` failing to compile
in Bazel and argument-comment-lint jobs.

## What changed

- Pass `AnalyticsEventsClient::disabled()` to the missed
`OutgoingMessageSender::new` test call site in `command_exec.rs`.

## Verification

- `cargo test -p codex-app-server
timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`

* [app-server] type client response payloads (openai#20050)

## Why

`pr17088` adds typed server-originated request/response plumbing, but
successful client responses are still erased into bare JSON-RPC `result`
values before app-server can make any typed decision about them.

This precursor PR keeps successful client responses typed until the
outgoing response seam. It is intentionally limited to
protocol/app-server plumbing so the analytics behavior change can review
separately on top.

## What changed

- Add `ClientResponsePayload` as the pre-serialization client response
body type.
- Route app-server successful response paths through the typed payload
seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no
`ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test
module.

## Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`

* Require remote plugin detail before uninstall (openai#19966)

## Summary
- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin
name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so
backend state and local cache state do not diverge.

## Testing
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`

* [app-server] centralize client response analytics (openai#20059)

## Why

The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.

## What changed

- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`

* Fallback login callback port when default is busy (openai#19334)

## Summary
- Keep the preferred ChatGPT login callback port `1455` first.
- Preserve the existing `/cancel` recovery for stale Codex login
servers.
- Fall back to the registered localhost callback port `1457` when `1455`
remains unavailable.

## Why
Cursor and Codex Desktop both use the ChatGPT account login callback
server. On Windows, Cursor can already be listening on `127.0.0.1:1455`
/ `[::1]:1455`, causing Codex Desktop sign-in to fail with:

`Local callback port 1455 is already in use on this machine.`

Codex already attempted to cancel a stale Codex login server on that
port, but if the listener does not release the port, the old behavior
was to fail. The new behavior falls back to `1457`, which matches the
fixed redirect URI being registered server-side in
`openai/openai#863817`. This keeps the OAuth `redirect_uri` inside
Hydra's exact allow-list instead of choosing an arbitrary ephemeral
port.

## Validation
- `just fmt`
- `cargo test -p codex-login`
- `git diff --check HEAD~1..HEAD`

* [apps] Add apps MCP path override (openai#20231)

Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`

* docs: discourage `#[async_trait]` and `#[allow(async_fn_in_trait)]` (openai#20242)

## Why

We have run into two avoidable problems when introducing async trait
APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this
repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait
without spelling out whether the returned future is `Send`, which hides
an important part of the trait contract.

We already have a good example of the preferred alternative in
[openai#16630](openai#16630) /
[`3c7f013f9735`](openai@3c7f013f9735),
but that guidance currently lives only as prior art in the codebase.
This PR documents the rule in `AGENTS.md` so contributors are more
likely to follow the native RPITIT pattern before these two shortcuts
spread further.

## What Changed

- added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]`
and `#[allow(async_fn_in_trait)]`
- pointed contributors to the native RPITIT pattern with explicit `Send`
bounds on the returned future
- clarified that implementations may still use `async fn` when they
satisfy that trait contract

## Verification

- docs-only change; no tests run

* Escape turn metadata headers as ASCII JSON (openai#19620)

## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes openai#17468. Also addresses the duplicate WebSocket report in openai#19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`

* [mcp] Fix plugin MCP approval policy. (openai#19537)

Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`

* Add agent graph store interface (openai#19229)

## Summary

Persisted subagent parent/child topology currently leaks through
`StateRuntime`'s SQLite-specific thread-spawn helpers. This PR
introduces a narrow `AgentGraphStore` boundary so follow-up work can
route graph operations through a local or remote store without coupling
orchestration code directly to the state DB graph API.

## Changes

- Adds the new `codex-agent-graph-store` crate.
- Defines a flat `AgentGraphStore` trait for the v1 graph surface:
upsert edge, set edge status, list direct children, and list
descendants.
- Adds public graph types for `ThreadSpawnEdgeStatus`,
`AgentGraphStoreError`, and `AgentGraphStoreResult`.
- Implements `LocalAgentGraphStore` on top of an existing
`codex_state::StateRuntime`, preserving today's SQLite-backed
`thread_spawn_edges` behavior.
- Registers the crate in Cargo/Bazel metadata.

This PR only adds the local contract and implementation; call-site
migration and the remote gRPC store are left to the follow-up PRs in the
stack.

## Testing

- `cargo test -p codex-agent-graph-store`

The new unit tests cover local parity with the existing `StateRuntime`
graph methods, `Open`/`Closed` filtering, status updates, and stable
breadth-first descendant ordering.

---------

Co-authored-by: rhan-oai <rhan@openai.com>
Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: sayan-oai <sayan@openai.com>
Co-authored-by: xli-oai <xli@openai.com>
Co-authored-by: Alex Daley <adaley@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: Calixto <manoel.calixto.neto@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant