feat: cli OpenAI-compatible API `response_format` support by markstur · Pull Request #884 · generative-computing/mellea

markstur · 2026-04-17T23:37:58Z

Misc PR

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Fixes #824 (`m serve` OpenAI API structured output)

feat: cli OpenAI-compatible API `response_format` support

   - Added `JsonSchemaFormat` model to represent JSON schema definitions
   - Extended `ResponseFormat` to support `json_schema` type (in addition to existing `text` and `json_object`)
   - Used field alias to avoid conflict with Pydantic's `schema` method
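The alias approach described above can be sketched roughly as follows. This is a minimal illustration assuming Pydantic v2, not the code from this PR: the attribute name `schema_` and the `strict` field are assumptions chosen for the example, and only the class names `JsonSchemaFormat` and `ResponseFormat` come from the description.

```python
from typing import Any, Literal, Optional

from pydantic import BaseModel, Field


class JsonSchemaFormat(BaseModel):
    """Sketch of a JSON schema definition as sent in `response_format.json_schema`."""

    name: str
    # `schema` clashes with a (deprecated) BaseModel method, so the attribute
    # gets a different Python name and is read from the wire under the alias.
    # The attribute name `schema_` is an assumption for this sketch.
    schema_: dict[str, Any] = Field(alias="schema")
    strict: Optional[bool] = None  # assumed optional field


class ResponseFormat(BaseModel):
    """Sketch of the request's `response_format` object."""

    type: Literal["text", "json_object", "json_schema"]
    json_schema: Optional[JsonSchemaFormat] = None


parsed = ResponseFormat.model_validate(
    {
        "type": "json_schema",
        "json_schema": {"name": "Person", "schema": {"type": "object"}},
    }
)
```

With this layout, clients keep sending the key `schema`, while server code reads `parsed.json_schema.schema_`.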

   - Added `_json_schema_to_pydantic()` utility function to dynamically convert JSON schemas to Pydantic models
   - Updated `_build_model_options()` to exclude `response_format` from model options (handled separately)
   - Modified `make_chat_endpoint()` to:
     - Parse `response_format` from requests
     - Convert `json_schema` type to Pydantic models using the utility function
     - Detect if the serve function accepts a `format` parameter using `inspect.signature()`
     - Pass the generated Pydantic model as `format=` parameter to serve functions that support it
     - Handle backward compatibility with serve functions that don't accept `format`
   - Added proper error handling for invalid schemas
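The `inspect.signature()` check and the backward-compatible call could look something like the sketch below. The helper names `accepts_format` and `call_serve` are hypothetical, and treating a `**kwargs` catch-all as accepting `format` is an assumption of this example rather than something the PR description states.

```python
import inspect
from typing import Any, Callable


def accepts_format(serve_fn: Callable[..., Any]) -> bool:
    """Return True if serve_fn can receive a `format` keyword argument."""
    params = inspect.signature(serve_fn).parameters
    if "format" in params:
        return True
    # Assumption: a **kwargs catch-all also counts as accepting `format`.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_serve(
    serve_fn: Callable[..., Any],
    messages: list,
    output_model: Any = None,
) -> Any:
    """Call serve_fn, passing `format=` only when a model exists and is supported."""
    kwargs = {}
    if output_model is not None and accepts_format(serve_fn):
        kwargs["format"] = output_model
    return serve_fn(messages, **kwargs)


# Two hypothetical serve functions: one new-style, one written before `format` existed.
def new_serve(messages, format=None):
    return ("new", format)


def old_serve(messages):
    return ("old", None)
```

Older serve functions are called exactly as before, so they keep working without changes; they simply never see the schema.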

   - Test json_schema format is converted to Pydantic model and passed to serve
   - Test json_object format doesn't pass a schema
   - Test text format doesn't pass a schema
   - Test error handling for missing json_schema field
   - Test error handling for invalid JSON schemas
   - Test backward compatibility with serve functions without format parameter
   - Test optional fields in JSON schemas

When a client sends a request with `response_format.type = "json_schema"`, the server:
1. Extracts the JSON schema from `response_format.json_schema.schema`
2. Dynamically creates a Pydantic model from the schema
3. Passes it as the `format=` parameter to the serve function
4. The serve function can then use this for constrained decoding via Mellea's `instruct()` method

This maps OpenAI's `response_format` API to Mellea's native `format=` parameter for structured output.
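Steps 1 to 3 above can be illustrated with a small sketch. It assumes Pydantic v2 and only handles flat object schemas with primitive property types; the function name `sketch_json_schema_to_pydantic` is hypothetical and is not the PR's `_json_schema_to_pydantic()`, whose actual coverage and error types are not shown here.

```python
from typing import Any, Optional

from pydantic import BaseModel, create_model

# Mapping from JSON Schema primitive type names to Python types.
_JSON_TO_PY: dict[str, Any] = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def sketch_json_schema_to_pydantic(
    schema: dict[str, Any], name: str = "DynamicModel"
) -> type[BaseModel]:
    """Build a Pydantic model from a flat JSON object schema (sketch only)."""
    if schema.get("type") != "object" or not isinstance(schema.get("properties"), dict):
        raise ValueError("this sketch only supports object schemas with 'properties'")
    required = set(schema.get("required", []))
    fields: dict[str, Any] = {}
    for prop, spec in schema["properties"].items():
        json_type = spec.get("type")
        py_type = _JSON_TO_PY.get(json_type, Any) if isinstance(json_type, str) else Any
        if prop in required:
            fields[prop] = (py_type, ...)  # required field, no default
        else:
            fields[prop] = (Optional[py_type], None)  # optional field defaults to None
    return create_model(name, **fields)


# Step 1: the schema arrives under response_format.json_schema.schema.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "Person",
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            "required": ["name"],
        },
    },
}

# Step 2: build the model. Step 3 would pass `Person` to the serve function as format=.
Person = sketch_json_schema_to_pydantic(
    response_format["json_schema"]["schema"],
    response_format["json_schema"]["name"],
)
```

Properties not listed under `required` become optional fields defaulting to `None`, which is the behavior the "optional fields" test case in this PR is concerned with.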

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

AI coding assistants used

Signed-off-by: Mark Sturdevant <[email protected]>

github-actions · 2026-04-17T23:38:14Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

markstur added 3 commits April 17, 2026 13:03

feat: add response_format support in cli when streaming

79f327f

Signed-off-by: Mark Sturdevant <[email protected]>

feat: cli response_format features adding doc examples

af7e905

Signed-off-by: Mark Sturdevant <[email protected]>

markstur requested a review from a team as a code owner April 17, 2026 23:37

markstur requested review from jakelorocco and planetf1 April 17, 2026 23:37

markstur changed the title ~~Issue 824~~ feat: cli OpenAI-compatible API response_format support Apr 17, 2026

github-actions bot added the enhancement New feature or request label Apr 17, 2026

markstur wants to merge 3 commits into generative-computing:main from markstur:issue_824
