Skip to content

fix: auto-append /v1 suffix to embedding_api_base in OpenAI embedding provider#6912

Closed
BillionClaw wants to merge 1 commit intoAstrBotDevs:masterfrom
BillionClaw:fix/6887-openai-embedding-v1-suffix
Closed

fix: auto-append /v1 suffix to embedding_api_base in OpenAI embedding provider#6912
BillionClaw wants to merge 1 commit intoAstrBotDevs:masterfrom
BillionClaw:fix/6887-openai-embedding-v1-suffix

Conversation

@BillionClaw
Copy link
Contributor

@BillionClaw BillionClaw commented Mar 24, 2026

Summary

Fixes: #6887

PR #6669 removed the automatic /v1 suffix from embedding_api_base because some non-OpenAI providers don't use standard /v1/embeddings endpoints. However, this broke OpenAI-compatible providers (e.g., text-embedding-3-large on SiliconFlow) that require the /v1 path. When /v1 is absent, the SDK sends requests to https://api.openai.com/embeddings instead of https://api.openai.com/v1/embeddings, returning an error string. Code then tries to call .data on that string, causing 'str' object has no attribute 'data'.

Modifications

  • astrbot/core/provider/sources/openai_embedding_source.py: If embedding_api_base is set but does not end with /v1 or /v1/, automatically append /v1.
  • tests/unit/test_openai_embedding_source.py: Unit tests covering all API base URL variants.

Verification

  • This is NOT a breaking change. Existing configs with /v1 are unaffected.

Summary by Sourcery

Restore correct handling of the OpenAI embedding API base URL by normalizing it to include the /v1 suffix when appropriate and add coverage for this behavior.

Bug Fixes:

  • Ensure embedding_api_base automatically appends a /v1 suffix when missing so OpenAI-compatible embedding endpoints are called with the correct path.

Tests:

  • Add unit tests verifying that various embedding_api_base configurations correctly normalize to endpoints with or without the /v1 suffix as expected.

@auto-assign auto-assign bot requested review from Fridemn and anka-afk March 24, 2026 22:20
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Mar 24, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical regression that affected OpenAI-compatible embedding providers. A previous change had inadvertently removed the automatic appending of the /v1 suffix to the embedding_api_base, leading to incorrect API calls and subsequent errors. The current fix reintroduces this essential logic, ensuring that the embedding_api_base is correctly formatted with /v1 when necessary, thereby restoring functionality for these providers and preventing runtime exceptions.

Highlights

  • Issue Fix: Fixed a regression where OpenAI-compatible embedding providers failed due to missing /v1 suffix in embedding_api_base.
  • API Base Logic: Implemented logic to automatically append /v1 to embedding_api_base if it's not already present, ensuring correct API endpoint construction.
  • Test Coverage: Added comprehensive unit tests to validate the new embedding_api_base suffix handling for various scenarios.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Mar 24, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tests/unit/test_openai_embedding_source.py" line_range="64-37" />
<code_context>
+        provider = _make_provider({"embedding_api_base": "https://openai.example.com/v1"})
+        assert provider.client.base_url == "https://openai.example.com/v1"
+
+    def test_empty_api_base_uses_default(self) -> None:
+        """Empty api_base should use the default OpenAI endpoint."""
+        provider = _make_provider({"embedding_api_base": ""})
+        assert provider.client.base_url == "https://api.openai.com/v1"
+
+    def test_default_api_base_is_unchanged(self) -> None:
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test case for `embedding_api_base` with surrounding whitespace

Since the constructor applies `.strip()` to `embedding_api_base`, it would be useful to add a test that passes values with leading/trailing spaces (e.g. `' https://api.openai.com '` or `' https://api.openai.com/ '`) and asserts that `provider.client.base_url` is normalized to `https://api.openai.com/v1`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

"""api_base like 'https://api.openai.com' should become 'https://api.openai.com/v1'."""
provider = _make_provider({"embedding_api_base": "https://api.openai.com"})
# The provider should auto-append /v1
assert provider.client.base_url == "https://api.openai.com/v1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test case for embedding_api_base with surrounding whitespace

Since the constructor applies .strip() to embedding_api_base, it would be useful to add a test that passes values with leading/trailing spaces (e.g. ' https://api.openai.com ' or ' https://api.openai.com/ ') and asserts that provider.client.base_url is normalized to https://api.openai.com/v1.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request reintroduces logic to automatically append the /v1 suffix to the OpenAI API base URL if it's not already present, addressing a regression. New unit tests have been added to cover various scenarios for this API base URL formatting. A high-severity issue was identified where the current implementation does not correctly handle an explicitly provided empty embedding_api_base, which would result in an invalid API base URL instead of defaulting to https://api.openai.com/v1.

Comment on lines +30 to +31
if api_base and not api_base.endswith("/v1") and not api_base.endswith("/v1/"):
api_base = api_base.rstrip("/") + "/v1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current logic for handling api_base does not correctly fall back to the default https://api.openai.com/v1 when embedding_api_base is explicitly provided as an empty string. If provider_config.get("embedding_api_base") returns an empty string, the if api_base condition evaluates to False, skipping the logic to append /v1. This results in self.client.base_url being set to an empty string, which is likely an invalid URL for AsyncOpenAI and would cause connection errors.

The unit test test_empty_api_base_uses_default in tests/unit/test_openai_embedding_source.py correctly identifies the desired behavior (falling back to the default), but the current implementation will cause that test to fail. The proposed change ensures that an empty api_base (after stripping whitespace) is treated as if it were not provided, and thus defaults to https://api.openai.com/v1.

        if not api_base:
            api_base = "https://api.openai.com/v1"
        elif not api_base.endswith("/v1") and not api_base.endswith("/v1/"):
            api_base = api_base.rstrip("/") + "/v1"

@Soulter Soulter closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

openai_embedding模型在AstrBot4.21和4.22版本下无法通过测试并出现报错

2 participants