💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0c8f41eab9
```rust
let status = match response.scope {
    PermissionGrantScope::Turn => ToolReviewStatus::Approved,
    PermissionGrantScope::Session => ToolReviewStatus::ApprovedForSession,
};
```
**Classify empty permission grants as denied reviews**

`PermissionsRequestApproval` maps `status` from `scope` alone, so `Turn` is always emitted as approved. But UI denial paths return an empty permissions profile with `scope = Turn` (see `approval_overlay::handle_permissions_decision`). As a result, denied or timed-out permission reviews are reported as approvals and denormalized into incorrect `final_approval_outcome` values on tool item events.
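A minimal sketch of the suggested fix, with the response and status types reduced to self-contained stand-ins (the `permissions` field standing in for the granted permissions profile is hypothetical; the real response shape may differ):

```rust
#[derive(Debug, PartialEq)]
enum PermissionGrantScope { Turn, Session }

#[derive(Debug, PartialEq)]
enum ToolReviewStatus { Approved, ApprovedForSession, Denied }

struct PermissionsResponse {
    scope: PermissionGrantScope,
    // Hypothetical stand-in for the granted permissions profile.
    permissions: Vec<String>,
}

fn review_status(response: &PermissionsResponse) -> ToolReviewStatus {
    // UI denial paths return an empty profile with scope = Turn, so check
    // for the empty grant before mapping scope to an approval status.
    if response.permissions.is_empty() {
        return ToolReviewStatus::Denied;
    }
    match response.scope {
        PermissionGrantScope::Turn => ToolReviewStatus::Approved,
        PermissionGrantScope::Session => ToolReviewStatus::ApprovedForSession,
    }
}
```

With this shape, an empty grant with `scope = Turn` classifies as `Denied` instead of `Approved`, while non-empty grants keep the existing scope mapping.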
```rust
let started = self.guardian_review_starts.remove(&notification.review_id);
let created_at = started
    .as_ref()
    .map(|start| start.created_at)
    .unwrap_or(completed_at);
let Some(status) = guardian_review_status(notification.review.status) else {
```
**Deduplicate guardian review completion notifications**

This handler emits a review even when `guardian_review_starts.remove(review_id)` returns `None`. Because notifications are tracked per connection, multi-client threads ingest the same completion once per connection, creating duplicate `codex_tool_call_review_event` records and inflating per-item review counters.
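A sketch of the suggested dedup, with the handler and timestamps simplified to self-contained stand-ins: a completion whose start was never recorded on this connection is treated as already handled and skipped instead of producing a second review event.

```rust
use std::collections::HashMap;

// Simplified stand-in for the handler state; the real struct tracks
// richer start records than a bare created_at timestamp.
struct GuardianReviews {
    guardian_review_starts: HashMap<String, u64>, // review_id -> created_at
}

impl GuardianReviews {
    /// Returns (created_at, completed_at) for the first completion of a
    /// tracked review, and None for duplicates or unknown reviews.
    fn on_completion(&mut self, review_id: &str, completed_at: u64) -> Option<(u64, u64)> {
        // `remove` returning None means this connection never saw the start,
        // or another ingest already consumed it; drop the duplicate.
        let created_at = self.guardian_review_starts.remove(review_id)?;
        Some((created_at, completed_at))
    }
}
```

Keying the early return on the `remove` result keeps the per-connection maps as the dedup source of truth without extra bookkeeping.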
## Why

Several analytics event families need the same per-thread attribution state: the app-server client/runtime associated with a thread and, for lifecycle-oriented events, the thread metadata captured during initialization. Keeping connection ids and lifecycle metadata in separate maps made each consumer rebuild the same thread context and made subagent attribution harder to resolve consistently.

## What changed

- Replaces the separate thread connection and metadata maps with one reducer-owned `threads` map.
- Routes guardian, compaction, turn-steer, and turn analytics through shared thread-state lookups while preserving turn-origin attribution for turn events and request-origin attribution for steer events.
- Lets newly observed spawned subagent threads inherit their parent thread connection so later thread-scoped analytics can resolve through the same state model.
- Adds regression coverage for standalone `SubAgentThreadStarted` publication plus the `SubAgentSource::ThreadSpawn` parent fallback through a thread-scoped consumer that depends on inherited connection state.

## Verification

- `cargo test -p codex-analytics`

---

Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20300).

* #18748
* #18747
* #17090
* #17089
* #20239
* #20515
* #20514
* __->__ #20300
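The reducer-owned `threads` map described above can be sketched as follows. All type and method names here are hypothetical illustrations, not the crate's real API; the point is that a spawned subagent thread inherits its parent's connection id, so later thread-scoped lookups resolve through one state model instead of separate maps.

```rust
use std::collections::HashMap;

// Hypothetical per-thread attribution state (the real ThreadState also
// carries lifecycle metadata captured during initialization).
#[derive(Clone, Debug)]
struct ThreadState {
    connection_id: u64,
    parent_thread: Option<String>,
}

#[derive(Default)]
struct Reducer {
    threads: HashMap<String, ThreadState>,
}

impl Reducer {
    fn thread_started(&mut self, thread_id: &str, connection_id: u64) {
        self.threads.insert(
            thread_id.to_string(),
            ThreadState { connection_id, parent_thread: None },
        );
    }

    fn subagent_spawned(&mut self, child_id: &str, parent_id: &str) {
        // A newly observed spawned thread inherits the parent connection.
        if let Some(parent) = self.threads.get(parent_id).cloned() {
            self.threads.entry(child_id.to_string()).or_insert(ThreadState {
                connection_id: parent.connection_id,
                parent_thread: Some(parent_id.to_string()),
            });
        }
    }

    fn connection_for(&self, thread_id: &str) -> Option<u64> {
        self.threads.get(thread_id).map(|t| t.connection_id)
    }
}
```

A thread-scoped consumer can then resolve a subagent's connection through `connection_for` without knowing whether the state was observed directly or inherited.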
### [codex-analytics] ingest server requests and responses (openai#17088)

#### Why

Codex analytics needs a typed seam for app-server-originated request/response traffic so future tool-approval analytics can consume those facts without adding bespoke callsite tracking each time. Server responses arrive as JSON-RPC `id + result` payloads, so analytics has to reconstruct the matching typed response from the original typed request while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to avoid keeping the runtime alive during shutdown. The final ownership fix keeps the normal strong auth-manager retention in analytics and makes the external-auth refresh bridge hold a weak back-reference to `OutgoingMessageSender`, breaking the runtime cycle at the bridge boundary instead of exposing retention policy through the analytics client API.

#### What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus `AnalyticsEventsClient::track_server_request` and `track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and `ClientResponse` so reducers can distinguish client-to-server traffic from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed request to decode the matching typed server response from a raw JSON-RPC result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and records targeted server requests, replayed targeted requests, and matching targeted responses with the responding connection id needed for correlation.
- Intentionally leaves broadcast server requests/responses out of analytics for now because the current model is per connection, while broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing `Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for now; this PR is plumbing for later tool-approval analytics work.

#### Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed, broadcast-exclusion, and response-attribution analytics paths.

#### Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest` and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs` for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor of the real-output coverage above.

### [tool_suggest] Improve tool_suggest triggering conditions (openai#20091)

#### Summary

- Tighten `tool_suggest` guidance so it prefers explicit plugin install requests, while still allowing a connector install when the relevant plugin is already installed and a needed connector from that plugin is missing.
- Tell the model not to call `tool_suggest` in parallel with other tools.

#### Testing

- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`

### app-server: fix outgoing sender test setup (openai#20258)

#### Why

openai#17088 changed `OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but one `command_exec` test added earlier on `main` still called the old one-argument constructor. That leaves current `main` failing to compile in Bazel and argument-comment-lint jobs.

#### What changed

- Pass `AnalyticsEventsClient::disabled()` to the missed `OutgoingMessageSender::new` test call site in `command_exec.rs`.

#### Verification

- `cargo test -p codex-app-server timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`

### [app-server] type client response payloads (openai#20050)

#### Why

openai#17088 adds typed server-originated request/response plumbing, but successful client responses are still erased into bare JSON-RPC `result` values before app-server can make any typed decision about them. This precursor PR keeps successful client responses typed until the outgoing response seam. It is intentionally limited to protocol/app-server plumbing so the analytics behavior change can review separately on top.

#### What changed

- Add `ClientResponsePayload` as the pre-serialization client response body type.
- Route app-server successful response paths through the typed payload seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no `ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test module.

#### Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`

### Require remote plugin detail before uninstall (openai#19966)

#### Summary

- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so backend state and local cache state do not diverge.

#### Testing

- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`

### [app-server] centralize client response analytics (openai#20059)

#### Why

The precursor PR keeps successful client responses typed until app-server's outgoing response seam. This follow-up uses that seam to move successful client-response analytics out of individual handlers and into the shared sender path, while keeping filtering decisions inside `codex-analytics`.

#### What changed

- Emit successful client-response analytics centrally from `OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the centralized forwarding path.

#### Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`

### Fallback login callback port when default is busy (openai#19334)

#### Summary

- Keep the preferred ChatGPT login callback port `1455` first.
- Preserve the existing `/cancel` recovery for stale Codex login servers.
- Fall back to the registered localhost callback port `1457` when `1455` remains unavailable.

#### Why

Cursor and Codex Desktop both use the ChatGPT account login callback server. On Windows, Cursor can already be listening on `127.0.0.1:1455` / `[::1]:1455`, causing Codex Desktop sign-in to fail with: `Local callback port 1455 is already in use on this machine.`

Codex already attempted to cancel a stale Codex login server on that port, but if the listener does not release the port, the old behavior was to fail. The new behavior falls back to `1457`, which matches the fixed redirect URI being registered server-side in `openai/openai#863817`. This keeps the OAuth `redirect_uri` inside Hydra's exact allow-list instead of choosing an arbitrary ephemeral port.

#### Validation

- `just fmt`
- `cargo test -p codex-login`
- `git diff --check HEAD~1..HEAD`

### [apps] Add apps MCP path override (openai#20231)

#### Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the configured path after that base.
- Regenerate the config schema with the custom feature-config case.

#### Test Plan

- Not run for latest revision; only `just fmt` and `just write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`

### docs: discourage `#[async_trait]` and `#[allow(async_fn_in_trait)]` (openai#20242)

#### Why

We have run into two avoidable problems when introducing async trait APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait without spelling out whether the returned future is `Send`, which hides an important part of the trait contract.

We already have a good example of the preferred alternative in openai#16630 / `3c7f013f9735`, but that guidance currently lives only as prior art in the codebase. This PR documents the rule in `AGENTS.md` so contributors are more likely to follow the native RPITIT pattern before these two shortcuts spread further.

#### What changed

- Added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]` and `#[allow(async_fn_in_trait)]`.
- Pointed contributors to the native RPITIT pattern with explicit `Send` bounds on the returned future.
- Clarified that implementations may still use `async fn` when they satisfy that trait contract.

#### Verification

- Docs-only change; no tests run.

### Escape turn metadata headers as ASCII JSON (openai#19620)

#### Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex was serializing the metadata JSON with raw UTF-8 string contents. When a workspace path contains non-ASCII characters, common HTTP stacks can reject or corrupt that header before the request reaches the provider. Fixes openai#17468. Also addresses the duplicate WebSocket report in openai#19581.

#### What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that serializes JSON normally while escaping non-ASCII string content as `\uXXXX`.
- Switched turn metadata header serialization, including merged Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client metadata while preserving the same parsed JSON values.

#### Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`

### [mcp] Fix plugin MCP approval policy (openai#19537)

Plugin MCP servers are loaded from plugin manifests rather than top-level `[mcp_servers]`, so their tool approval preferences need to be stored and applied through the owning plugin config. Without this, choosing "Always allow" for a plugin MCP tool could write a preference that was not reliably used on later tool calls.

#### Summary

- Add plugin-scoped MCP policy config under `plugins.<plugin>.mcp_servers`, including server enablement, tool allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config entry.
- Reload user config after persisting an approval and make the plugin load cache config-aware so stale plugin MCP policy is not reused after `config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy loading, approval lookup, persistence, and stale-cache prevention.

#### Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`

### Add agent graph store interface (openai#19229)

#### Summary

Persisted subagent parent/child topology currently leaks through `StateRuntime`'s SQLite-specific thread-spawn helpers. This PR introduces a narrow `AgentGraphStore` boundary so follow-up work can route graph operations through a local or remote store without coupling orchestration code directly to the state DB graph API.

#### Changes

- Adds the new `codex-agent-graph-store` crate.
- Defines a flat `AgentGraphStore` trait for the v1 graph surface: upsert edge, set edge status, list direct children, and list descendants.
- Adds public graph types for `ThreadSpawnEdgeStatus`, `AgentGraphStoreError`, and `AgentGraphStoreResult`.
- Implements `LocalAgentGraphStore` on top of an existing `codex_state::StateRuntime`, preserving today's SQLite-backed `thread_spawn_edges` behavior.
- Registers the crate in Cargo/Bazel metadata.

This PR only adds the local contract and implementation; call-site migration and the remote gRPC store are left to the follow-up PRs in the stack.

#### Testing

- `cargo test -p codex-agent-graph-store`

The new unit tests cover local parity with the existing `StateRuntime` graph methods, `Open`/`Closed` filtering, status updates, and stable breadth-first descendant ordering.

---

Co-authored-by: rhan-oai <rhan@openai.com>
Co-authored-by: Matthew Zeng <mzeng@openai.com>
Co-authored-by: sayan-oai <sayan@openai.com>
Co-authored-by: xli-oai <xli@openai.com>
Co-authored-by: Alex Daley <adaley@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: Calixto <manoel.calixto.neto@gmail.com>
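The ASCII-safe escaping that openai#19620 describes can be approximated with std-only code. This is a simplified stand-in, not the real `codex_utils_string::to_ascii_json_string` (which serializes whole JSON values); it only shows the character-level rule: non-ASCII characters become `\uXXXX` escapes, with surrogate pairs for characters outside the BMP, so any JSON parser decodes them back to the original text.

```rust
// Escape every non-ASCII character in a string as JSON \uXXXX units.
// Characters above U+FFFF emit two escapes (a UTF-16 surrogate pair).
fn escape_non_ascii(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if c.is_ascii() {
            out.push(c);
        } else {
            let mut buf = [0u16; 2];
            for &unit in c.encode_utf16(&mut buf).iter() {
                out.push_str(&format!("\\u{:04x}", unit));
            }
        }
    }
    out
}
```

For example, a workspace path containing `é` (U+00E9) serializes to `\u00e9`, keeping the `x-codex-turn-metadata` header pure ASCII while preserving the parsed JSON value.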