feat: OpenAPI refactor: shared schemas, EvalRunOutputItemResult, remove -2 #227
Walkthrough
Refactors and expands the OpenAPI spec in src/libs/tryAGI.OpenAI/openapi.yaml by adding shared schemas/enums, updating references to use them, introducing EvalRunOutputItemResult, extending ScoreModelGrader sampling_params, updating examples/descriptions, and removing legacy “-2” schema variants.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor Client
participant API as Eval Runs API
participant Grader as Grader Engine
Client->>API: POST /eval-runs/{id}/execute
API->>Grader: Evaluate output item(s)
Grader-->>API: Result(s) per item (EvalRunOutputItemResult[])
API-->>Client: 200 OK with results and usage (SearchContextSize)
note over API,Grader: Sampling params may include max_completions_tokens and reasoning_effort
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 5
🧹 Nitpick comments (3)
src/libs/tryAGI.OpenAI/openapi.yaml (3)
9979-9982: Avoid null-only schema; align with chosen OAS version.
type: 'null' plus nullable: true is tool-fragile (OAS 3.0 disallows type: null; OAS 3.1 removes nullable). If the field is truly always null, drop it. If it’s optional, model it as a union.
Option A (remove the property if it is always null):
- status:
-   type: 'null'
-   nullable: true
Option B (union; OAS 3.1 style):
- status:
-   type: 'null'
-   nullable: true
+ status:
+   anyOf:
+     - { type: 'null' }
+     - { type: string }
Run a quick validation with your chosen linter to confirm compatibility.
18410-18413: Minor: description spacing can confuse formatters.
The multi-line string contains irregular spacing; non-blocking.
- description: "Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.\n Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters."
+ description: "Set of up to 16 key–value pairs attached to an object. Useful for storing additional structured metadata and querying via API or dashboard. Keys: max 64 chars. Values: max 512 chars."
13876-13881: Consolidate duplicate enums: remove DetailEnum and reuse ImageDetail.
DetailEnum duplicates ImageDetail (low/high/auto). Remove DetailEnum from the OpenAPI spec, update any $ref to use ImageDetail, then regenerate the client code to remove the generated DetailEnum artifacts.
Definitions/targets: src/libs/tryAGI.OpenAI/openapi.yaml:13876 (DetailEnum) and src/libs/tryAGI.OpenAI/openapi.yaml:16102 (ImageDetail). Generated files referencing DetailEnum: src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.DetailEnum.g.cs, src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.DetailEnum*.g.cs, and entries in JsonSerializerContextTypes.g.cs.
Proposed removal:
- DetailEnum:
-   enum:
-     - low
-     - high
-     - auto
-   type: string
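Before removing DetailEnum by hand, it may help to confirm which enums really are duplicates. The following is a minimal sketch, assuming the spec has already been parsed into a Python dict; `find_duplicate_enums` is a hypothetical helper, and the `SearchContextSize` values in the sample are assumed for illustration only.

```python
from collections import defaultdict


def find_duplicate_enums(schemas):
    """Group schema names that declare the same type and the same set of enum values."""
    groups = defaultdict(list)
    for name, schema in schemas.items():
        values = schema.get("enum")
        if isinstance(values, list):
            # Order-insensitive key; values are stringified so mixed types still sort.
            key = (schema.get("type"), tuple(sorted(map(str, values))))
            groups[key].append(name)
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical excerpt of components.schemas, shaped like the enums in this comment.
schemas = {
    "DetailEnum": {"type": "string", "enum": ["low", "high", "auto"]},
    "ImageDetail": {"type": "string", "enum": ["low", "high", "auto"]},
    "SearchContextSize": {"type": "string", "enum": ["low", "medium", "high"]},  # assumed values
}

duplicates = find_duplicate_enums(schemas)
print(duplicates)  # [['DetailEnum', 'ImageDetail']]
```

Each reported group is only a candidate for consolidation; two enums with identical values can still carry different meanings, so review them before updating any $ref.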
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (99)
All 99 files below are under src/libs/tryAGI.OpenAI/Generated/ and are excluded by !**/generated/**:
tryAGI.OpenAI..JsonSerializerContext.g.cs
tryAGI.OpenAI.ConversationsClient.UpdateConversation.g.cs
tryAGI.OpenAI.IConversationsClient.UpdateConversation.g.cs
tryAGI.OpenAI.JsonConverters.Annotation2.g.cs
tryAGI.OpenAI.JsonConverters.ComputerCallOutputItemParamStatus.g.cs
tryAGI.OpenAI.JsonConverters.ComputerCallOutputItemParamStatusNullable.g.cs
tryAGI.OpenAI.JsonConverters.ComputerEnvironment1.g.cs
tryAGI.OpenAI.JsonConverters.ComputerEnvironment1Nullable.g.cs
tryAGI.OpenAI.JsonConverters.ComputerUsePreviewToolEnvironment.g.cs
tryAGI.OpenAI.JsonConverters.ComputerUsePreviewToolEnvironmentNullable.g.cs
tryAGI.OpenAI.JsonConverters.ContainerFileCitationBody2Type.g.cs
tryAGI.OpenAI.JsonConverters.ContainerFileCitationBody2TypeNullable.g.cs
tryAGI.OpenAI.JsonConverters.ContentItem.g.cs
tryAGI.OpenAI.JsonConverters.DetailEnum.g.cs
tryAGI.OpenAI.JsonConverters.DetailEnumNullable.g.cs
tryAGI.OpenAI.JsonConverters.FileCitationBody2Type.g.cs
tryAGI.OpenAI.JsonConverters.FunctionCallItemStatus.g.cs
tryAGI.OpenAI.JsonConverters.FunctionCallItemStatusNullable.g.cs
tryAGI.OpenAI.JsonConverters.ImageDetail.g.cs
tryAGI.OpenAI.JsonConverters.ImageDetailNullable.g.cs
tryAGI.OpenAI.JsonConverters.InputFileContent2Type.g.cs
tryAGI.OpenAI.JsonConverters.InputFileContent2TypeNullable.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContent2Detail.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContent2DetailNullable.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContent2Type.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContent2TypeNullable.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContentDetail.g.cs
tryAGI.OpenAI.JsonConverters.InputImageContentDetailNullable.g.cs
tryAGI.OpenAI.JsonConverters.OutputTextContent2Type.g.cs
tryAGI.OpenAI.JsonConverters.OutputTextContent2TypeNullable.g.cs
tryAGI.OpenAI.JsonConverters.RankerVersionType.g.cs
tryAGI.OpenAI.JsonConverters.RankerVersionTypeNullable.g.cs
tryAGI.OpenAI.JsonConverters.SearchContextSize.g.cs
tryAGI.OpenAI.JsonConverters.SearchContextSizeNullable.g.cs
tryAGI.OpenAI.JsonConverters.WebSearchPreviewToolSearchContextSize.g.cs
tryAGI.OpenAI.JsonConverters.WebSearchPreviewToolSearchContextSizeNullable.g.cs
tryAGI.OpenAI.JsonSerializerContextTypes.g.cs
tryAGI.OpenAI.Models.Annotation2.Json.g.cs
tryAGI.OpenAI.Models.Annotation2.g.cs
tryAGI.OpenAI.Models.Annotation2Discriminator.Json.g.cs
tryAGI.OpenAI.Models.Annotation2Discriminator.g.cs
tryAGI.OpenAI.Models.ComputerCallOutputItemParam.g.cs
tryAGI.OpenAI.Models.ComputerCallOutputItemParamStatus.g.cs
tryAGI.OpenAI.Models.ComputerEnvironment1.g.cs
tryAGI.OpenAI.Models.ComputerScreenshotContent.g.cs
tryAGI.OpenAI.Models.ComputerUsePreviewTool.g.cs
tryAGI.OpenAI.Models.ContainerFileCitationBody2.g.cs
tryAGI.OpenAI.Models.ContainerFileCitationBody2Type.g.cs
tryAGI.OpenAI.Models.ContentItem.g.cs
tryAGI.OpenAI.Models.ConversationItem.g.cs
tryAGI.OpenAI.Models.CreateEvalCompletionsRunDataSourceSamplingParams.g.cs
tryAGI.OpenAI.Models.CreateEvalResponsesRunDataSourceSamplingParams.g.cs
tryAGI.OpenAI.Models.DetailEnum.g.cs
tryAGI.OpenAI.Models.EvalRunOutputItem.g.cs
tryAGI.OpenAI.Models.EvalRunOutputItemResult.g.cs
tryAGI.OpenAI.Models.EvalRunOutputItemResultSample.Json.g.cs
tryAGI.OpenAI.Models.EvalRunOutputItemResultSample.g.cs
tryAGI.OpenAI.Models.FileCitationBody2.Json.g.cs
tryAGI.OpenAI.Models.FileCitationBody2.g.cs
tryAGI.OpenAI.Models.FunctionCallItemStatus.g.cs
tryAGI.OpenAI.Models.FunctionCallOutputItemParam.g.cs
tryAGI.OpenAI.Models.GraderScoreModel.g.cs
tryAGI.OpenAI.Models.GraderScoreModelSamplingParams.g.cs
tryAGI.OpenAI.Models.ImageDetail.g.cs
tryAGI.OpenAI.Models.InputFileContent2.Json.g.cs
tryAGI.OpenAI.Models.InputFileContent2.g.cs
tryAGI.OpenAI.Models.InputFileContent2Type.g.cs
tryAGI.OpenAI.Models.InputImageContent.g.cs
tryAGI.OpenAI.Models.InputImageContent2.Json.g.cs
tryAGI.OpenAI.Models.InputImageContent2.g.cs
tryAGI.OpenAI.Models.InputImageContent2Type.g.cs
tryAGI.OpenAI.Models.InputTextContent2.Json.g.cs
tryAGI.OpenAI.Models.InputTextContent2.g.cs
tryAGI.OpenAI.Models.InputTextContent2Type.g.cs
tryAGI.OpenAI.Models.LogProb2.Json.g.cs
tryAGI.OpenAI.Models.LogProb2.g.cs
tryAGI.OpenAI.Models.Message.g.cs
tryAGI.OpenAI.Models.MessageRole.g.cs
tryAGI.OpenAI.Models.MessageStatus.g.cs
tryAGI.OpenAI.Models.MetadataParam.g.cs
tryAGI.OpenAI.Models.OutputTextContent2.Json.g.cs
tryAGI.OpenAI.Models.OutputTextContent2.g.cs
tryAGI.OpenAI.Models.OutputTextContent2Type.g.cs
tryAGI.OpenAI.Models.RankerVersionType.g.cs
tryAGI.OpenAI.Models.RankingOptions.g.cs
tryAGI.OpenAI.Models.RefusalContent2.Json.g.cs
tryAGI.OpenAI.Models.RefusalContent2.g.cs
tryAGI.OpenAI.Models.RefusalContent2Type.g.cs
tryAGI.OpenAI.Models.SearchContextSize.g.cs
tryAGI.OpenAI.Models.SummaryTextContent.g.cs
tryAGI.OpenAI.Models.TextContent.g.cs
tryAGI.OpenAI.Models.TopLogProb2.Json.g.cs
tryAGI.OpenAI.Models.TopLogProb2.g.cs
tryAGI.OpenAI.Models.UpdateConversationBody.g.cs
tryAGI.OpenAI.Models.UrlCitationBody2.Json.g.cs
tryAGI.OpenAI.Models.UrlCitationBody2.g.cs
tryAGI.OpenAI.Models.UrlCitationBody2Type.g.cs
tryAGI.OpenAI.Models.WebSearchPreviewTool.g.cs
tryAGI.OpenAI.Models.WebSearchPreviewToolSearchContextSize.g.cs
📒 Files selected for processing (1)
src/libs/tryAGI.OpenAI/openapi.yaml (23 hunks)
🔇 Additional comments (16)
src/libs/tryAGI.OpenAI/openapi.yaml (16)
10037-10037: LGTM: clearer description.
10174-10176: LGTM: shared enum ref for environment reduces drift.
11434-11435: LGTM: add reasoning_effort alongside max_completion_tokens.
11711-11712: LGTM: mirrored reasoning_effort here too.
14631-14632: LGTM: results now reference a reusable result schema.
15663-15667: LGTM on introducing a status enum; ensure usages match.
Confirm all status properties that should use this enum actually reference it (see Line 15690 below).
16102-16107: LGTM: ImageDetail enum centralizes detail values.
16686-16686: LGTM: refer to ImageDetail in InputFileContent.
17772-17782: LGTM: Message uses shared role/status enums; nice cleanup and docs.
18304-18320: LGTM: Public MessageRole/MessageStatus look sane; keep values stable.
Adding/removing enum values is breaking for many clients; treat changes here as versioned.
Consider documenting stability guarantees in release notes.
19432-19441: LGTM: externalize ranker versions; document stability of dated identifiers.
Add a note on how default-2024-11-15 evolves and whether auto is recommended for long-lived clients.
28135-28140: LGTM: SearchContextSize enum clarifies allowed values.
28323-28323: LGTM: clearer summary description.
28351-28351: LGTM: clearer text content description.
30076-30079: LGTM: reference shared SearchContextSize.
17757-17765: LGTM — verification passed
- No $ref to "#/components/schemas/*-2" was found.
- RefusalContent, InputTextContent, OutputTextContent, InputImageContent, and InputFileContent are each defined exactly once.
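The first check above can be reproduced with a small script. This is a rough sketch, assuming the spec is already loaded into a Python dict; `iter_refs` and `legacy_refs` are hypothetical helpers, and the sample schemas (including `LegacyHolder`) are made up for illustration. It does not cover the "defined exactly once" check, since duplicate keys are already collapsed once YAML is parsed.

```python
def iter_refs(node):
    """Yield every "$ref" string found anywhere in a parsed OpenAPI document."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def legacy_refs(spec, suffix="-2"):
    """Return sorted refs that point at a component schema whose name ends with `suffix`."""
    prefix = "#/components/schemas/"
    return sorted(
        {ref for ref in iter_refs(spec) if ref.startswith(prefix) and ref.endswith(suffix)}
    )


# Hypothetical spec fragment: one clean ref and one leftover "-2" ref.
spec = {
    "components": {
        "schemas": {
            "Message": {
                "properties": {
                    "content": {
                        "type": "array",
                        "items": {"$ref": "#/components/schemas/InputTextContent"},
                    }
                }
            },
            "LegacyHolder": {
                "properties": {
                    "content": {"$ref": "#/components/schemas/InputTextContent-2"}
                }
            },
        }
    }
}

found = legacy_refs(spec)
print(found)
```

An empty list corresponds to the "No $ref ... was found" result reported in this comment.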
ComputerEnvironment1:
  enum:
    - windows
    - mac
    - linux
    - ubuntu
    - browser
  type: string
Ubuntu vs Linux: adding a new enum value can break clients.
Including both linux and ubuntu introduces overlap and may be a breaking change for strict consumers. Recommend omitting ubuntu or documenting/versioning this change.
Apply:
  ComputerEnvironment1:
    enum:
      - windows
      - mac
-     - linux
-     - ubuntu
+     - linux
      - browser
    type: string
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 10006 to 10013, the new enum
value "ubuntu" overlaps with existing "linux" and can break strict clients;
remove the "ubuntu" enum entry from the schema (or alternatively revert this
change and add a clear semver-major version bump and changelog note if you must
keep it), update any example/usage docs to reflect the retained values, and add
a release note documenting the enum change and the required client
impact/versioning decision.
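To make the versioning decision concrete, the enum values before and after the change can be compared mechanically. A minimal sketch follows; `enum_changes` is a hypothetical helper, and the `previous` list is an assumption about the pre-PR values inferred from this comment, not taken from the spec history.

```python
def enum_changes(old_values, new_values):
    """Report which enum values were added or removed between two spec versions."""
    old, new = set(old_values), set(new_values)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Assumed pre-PR values for ComputerEnvironment1 (not confirmed by the source).
previous = ["windows", "mac", "linux", "browser"]
# Values shown in the snippet under review.
current = ["windows", "mac", "linux", "ubuntu", "browser"]

changes = enum_changes(previous, current)
print(changes)  # {'added': ['ubuntu'], 'removed': []}
```

Any non-empty "added" or "removed" list is a signal to document the change in release notes, since strict clients may reject values they do not know.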
EvalRunOutputItemResult:
  title: EvalRunOutputItemResult
  required:
    - name
    - score
    - passed
  type: object
  properties:
    name:
      type: string
      description: The name of the grader.
    passed:
      type: boolean
      description: Whether the grader considered the output a pass.
    sample:
      type: object
      description: Optional sample or intermediate data produced by the grader.
      nullable: true
    score:
      type: number
      description: The numeric score produced by the grader.
    type:
      type: string
      description: 'The grader type (for example, "string-check-grader").'
  description: "A single grader result for an evaluation run output item.\n"
Result schema includes “sample” which duplicates parent; shape mismatch with example.
EvalRunOutputItemResult.sample conflicts with the top-level output_item.sample shown in the example. Keep sample only at one level (prefer parent), or clearly differentiate (e.g., debug_sample).
Apply:
  EvalRunOutputItemResult:
    title: EvalRunOutputItemResult
    required:
      - name
      - score
      - passed
    type: object
    properties:
      name:
        type: string
        description: The name of the grader.
      passed:
        type: boolean
        description: Whether the grader considered the output a pass.
-     sample:
-       type: object
-       description: Optional sample or intermediate data produced by the grader.
-       nullable: true
      score:
        type: number
        description: The numeric score produced by the grader.
      type:
        type: string
        description: 'The grader type (for example, "string-check-grader").'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
EvalRunOutputItemResult:
  title: EvalRunOutputItemResult
  required:
    - name
    - score
    - passed
  type: object
  properties:
    name:
      type: string
      description: The name of the grader.
    passed:
      type: boolean
      description: Whether the grader considered the output a pass.
    score:
      type: number
      description: The numeric score produced by the grader.
    type:
      type: string
      description: 'The grader type (for example, "string-check-grader").'
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 14764 to 14788, the schema
defines EvalRunOutputItemResult.sample which duplicates the parent
output_item.sample and the example shows a different shape; remove the nested
sample field or rename it (e.g., debug_sample) and update its schema and example
to match the new name or eliminate it so only the parent sample remains; update
any $ref/usages and the example objects to reflect the single canonical sample
location, and run schema validation to ensure shapes match the example.
type: 'null'
nullable: true
Status modeled as null conflicts with new enum.
If FunctionCallItemStatus (Lines 15663–15667) is authoritative, allow union with null rather than null-only.
- status:
-   type: 'null'
-   nullable: true
+ status:
+   anyOf:
+     - $ref: '#/components/schemas/FunctionCallItemStatus'
+     - { type: 'null' }
Align with your OAS version (use nullable only if staying on 3.0).
📝 Committable suggestion
status:
  anyOf:
    - $ref: '#/components/schemas/FunctionCallItemStatus'
    - { type: 'null' }
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 15690-15691, the schema
currently models a null-only status which conflicts with the
FunctionCallItemStatus enum defined at lines 15663–15667; update the schema to
allow the enum values or null (i.e., a union) instead of null-only—if your
OpenAPI version is 3.1+, represent this as a oneOf/anyOf with the enum schema
and a {"type":"null"}, or if you remain on 3.0.x use the nullable: true flag on
the enum schema so the enum can also be null; ensure the final representation
matches the file's OAS version.
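Null-only schemas like the one flagged here are easy to miss in a large spec, so a small scan can help locate the rest. The sketch below assumes the spec is already parsed into a Python dict; `find_null_only` is a hypothetical helper, and it is deliberately naive: it treats any `type: 'null'` that is not a direct branch of anyOf/oneOf as null-only.

```python
UNION_KEYS = ("anyOf", "oneOf")


def find_null_only(node, path="#", in_union=False):
    """Return paths of schemas that declare type 'null' outside an anyOf/oneOf branch."""
    hits = []
    if isinstance(node, dict):
        if node.get("type") == "null" and not in_union:
            hits.append(path)
        for key, value in node.items():
            if key in UNION_KEYS and isinstance(value, list):
                # Direct branches of a union may legitimately be null.
                for index, branch in enumerate(value):
                    hits.extend(find_null_only(branch, f"{path}/{key}/{index}", True))
            else:
                hits.extend(find_null_only(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_null_only(item, f"{path}/{index}"))
    return hits


# Shape of the property as flagged in this comment.
before = {"properties": {"status": {"type": "null", "nullable": True}}}
# Shape after applying the suggested union.
after = {
    "properties": {
        "status": {
            "anyOf": [
                {"$ref": "#/components/schemas/FunctionCallItemStatus"},
                {"type": "null"},
            ]
        }
    }
}

print(find_null_only(before))
print(find_null_only(after))
```

The scan only locates candidates; whether each one should become a union or be removed still depends on the OAS version the file targets.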
properties:
  max_completions_tokens:
    minimum: 1
    type: integer
    description: "The maximum number of tokens the grader model may generate in its response.\n"
    nullable: true
  reasoning_effort:
    $ref: '#/components/schemas/ReasoningEffort'
  seed:
    type: integer
    description: "A seed value to initialize the randomness, during sampling.\n"
    nullable: true
  temperature:
    type: number
    description: "A higher temperature increases randomness in the outputs.\n"
    nullable: true
  top_p:
    type: number
    description: "An alternative to temperature for nucleus sampling; 1.0 includes all tokens.\n"
    default: 1
    nullable: true
    example: 1
description: The sampling parameters for the model.
Inconsistent property name: max_completions_tokens vs max_completion_tokens.
Elsewhere you use max_completion_tokens. This pluralization drift will break clients.
  sampling_params:
    type: object
    properties:
-     max_completions_tokens:
+     max_completion_tokens:
        minimum: 1
        type: integer
        description: "The maximum number of tokens the grader model may generate in its response.\n"
        nullable: true
      reasoning_effort:
        $ref: '#/components/schemas/ReasoningEffort'
Add a deprecated alias only if you must support both for a transition.
📝 Committable suggestion
sampling_params:
  type: object
  properties:
    max_completion_tokens:
      minimum: 1
      type: integer
      description: "The maximum number of tokens the grader model may generate in its response.\n"
      nullable: true
    reasoning_effort:
      $ref: '#/components/schemas/ReasoningEffort'
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 15971 to 15993, the schema
uses the inconsistent property name max_completions_tokens; change the canonical
property to max_completion_tokens (singular "completion") and update all
references/examples to that name, and if you must support the old name during a
transition add max_completions_tokens as an alias property that has the exact
same schema type/constraints but is marked deprecated (add deprecated: true and
a description pointing users to max_completion_tokens) so both parse the same
value while signaling clients to migrate.
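Pluralization drift of this kind can be caught across the whole spec rather than one schema at a time. The following is a rough sketch, assuming a parsed spec dict; `collect_property_names` and `plural_drift` are hypothetical helpers, the sample schema shapes are illustrative only, and the trailing-"s" normalization is a crude heuristic that may produce false positives.

```python
from collections import defaultdict


def collect_property_names(node, found=None):
    """Collect every key that appears directly under a "properties" mapping."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict):
            found.update(props)
        for value in node.values():
            collect_property_names(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_property_names(item, found)
    return found


def plural_drift(names):
    """Group names that differ only by trailing "s" on some underscore-separated word."""
    groups = defaultdict(set)
    for name in names:
        key = "_".join(word.rstrip("s") for word in name.split("_"))
        groups[key].add(name)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# Hypothetical fragment: two schemas that spell the same parameter differently.
spec = {
    "GraderScoreModel": {
        "properties": {
            "sampling_params": {
                "type": "object",
                "properties": {
                    "max_completions_tokens": {"type": "integer"},
                    "seed": {"type": "integer"},
                },
            }
        }
    },
    "CreateEvalCompletionsRunDataSourceSamplingParams": {
        "properties": {
            "max_completion_tokens": {"type": "integer"},
            "seed": {"type": "integer"},
        }
    },
}

drift = plural_drift(collect_property_names(spec))
print(drift)
```

Each reported group is a pair of spellings to reconcile; which one is canonical is a decision for the spec owner, as discussed in this comment.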
  description: "A ScoreModelGrader object that uses a model to assign a score to the input.\n"
  x-oaiMeta:
-   example: "{\n \"type\": \"score_model\",\n \"name\": \"Example score model grader\",\n \"input\": [\n {\n \"role\": \"user\",\n \"content\": (\n \"Score how close the reference answer is to the model answer. Score 1.0 if they are the same and 0.0 if they are different.\"\n \" Return just a floating point score\\n\\n\"\n \" Reference answer: {{item.label}}\\n\\n\"\n \" Model answer: {{sample.output_text}}\"\n ),\n }\n ],\n \"model\": \"gpt-4o-2024-08-06\",\n \"sampling_params\": {\n \"temperature\": 1,\n \"top_p\": 1,\n \"seed\": 42,\n },\n}\n"
+   example: "{\n \"type\": \"score_model\",\n \"name\": \"Example score model grader\",\n \"input\": [\n {\n \"role\": \"user\",\n \"content\": (\n \"Score how close the reference answer is to the model answer. Score 1.0 if they are the same and 0.0 if they are different.\"\n \" Return just a floating point score\\n\\n\"\n \" Reference answer: {{item.label}}\\n\\n\"\n \" Model answer: {{sample.output_text}}\"\n ),\n }\n ],\n \"model\": \"o4-mini-2025-04-16\",\n \"sampling_params\": {\n \"temperature\": 1,\n \"top_p\": 1,\n \"seed\": 42,\n \"max_completions_tokens\": 32768,\n \"reasoning_effort\": \"medium\"\n },\n}\n"
🧩 Analysis chain
Fix example to match the canonical key and verify model id.
Update example key to max_completion_tokens. Also verify o4-mini-2025-04-16 is the correct, GA model identifier.
- "max_completions_tokens": 32768,
+ "max_completion_tokens": 32768,
🌐 Web query:
Is "o4-mini-2025-04-16" the correct, currently available OpenAI model id for scoring/graders?
💡 Result:
Short answer: No.
Details: o4-mini-2025-04-16 is a real OpenAI model released April 16, 2025. However Microsoft/Azure's grader (score_model) docs list supported grader model ids as gpt-4o-2024-08-06 and o3-mini-2025-01-31 — they do not list o4-mini-2025-04-16 as a supported grader. [1][2]
Sources:
- OpenAI announcement for o3 and o4-mini (release date Apr 16, 2025). [1]
- Azure OpenAI / Azure AI Foundry docs showing supported grader models (gpt-4o-2024-08-06, o3-mini-2025-01-31). [2]
Use canonical key and replace unsupported grader model
File: src/libs/tryAGI.OpenAI/openapi.yaml — update example key and replace the grader model id: Azure's grader docs list gpt-4o-2024-08-06 and o3-mini-2025-01-31 as supported; o4-mini-2025-04-16 is not listed.
- "max_completions_tokens": 32768,
+ "max_completion_tokens": 32768,
📝 Committable suggestion
"max_completion_tokens": 32768,
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around line 16002, the example uses the
non-canonical property key max_completions_tokens and a grader model id
(o4-mini-2025-04-16) that Azure's grader docs do not list; rename the key to
max_completion_tokens and swap the grader model id to one of the models Azure
lists as supported, gpt-4o-2024-08-06 or o3-mini-2025-01-31, so the example
matches the schema and the documented grader options.
Summary by CodeRabbit
New Features
Refactor
Documentation