In provider/openai.go:119, the responses route reads the full body into memory with io.ReadAll and then unmarshals it with json.Unmarshal. It should be migrated to json.NewDecoder, both for consistency with the chat completions route and to avoid the intermediate byte-slice allocation.
Note: the raw payload bytes are currently forwarded to the responses interceptors, so the migration isn't a direct swap. The interceptors would first need to be updated so that they no longer depend on the raw payload.
A scaletest should be run before and after the change to measure the memory and performance impact and confirm the refactor is worthwhile.