fix: append /v1 for OpenAI embedding api base by idiotsj · Pull Request #6910 · AstrBotDevs/AstrBot

idiotsj · 2026-03-24T17:22:32Z

Summary

normalize embedding_api_base for the OpenAI embedding provider
append /v1 automatically when the configured base URL omits it
add regression tests for both missing and existing /v1 suffixes

Problem

When embedding_api_base is configured without /v1, the OpenAI embedding client uses the raw base URL and requests fail against OpenAI-compatible endpoints that expect the /v1 prefix.

Closes #6855

Testing

uv run pytest tests/test_openai_source.py -q
uv run ruff format .
uv run ruff check .

Summary by Sourcery

Normalize OpenAI embedding API base URL handling and add regression coverage for various base URL configurations.

Bug Fixes:

Ensure OpenAI embedding provider automatically appends /v1 to root-style embedding_api_base values that omit the version segment.
Avoid introducing duplicate slashes and preserve versioned or path-specific embedding API base URLs, including query strings and fragments.
Fall back to the default OpenAI embedding base URL when embedding_api_base is blank in the configuration.

Tests:

Add unit tests covering normalization of embedding_api_base, including missing or existing /v1, trailing slashes, blank values, versioned paths, and URLs with query/fragment components.
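
The behavior listed above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the function name `normalize_api_base`, the constant `DEFAULT_API_BASE`, and the rule "treat any path segment starting with `v` plus a digit as already versioned" are assumptions made for the example.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed default; the review thread mentions this URL as the fallback.
DEFAULT_API_BASE = "https://api.openai.com/v1"

# Assumption: a segment such as "v1", "v1-beta" or "v2" marks a versioned path.
_VERSION_SEGMENT = re.compile(r"^v\d+")


def normalize_api_base(api_base: Optional[str]) -> str:
    """Append /v1 to root-style bases; keep versioned bases, query and fragment."""
    api_base = (api_base or "").strip()
    if not api_base:
        # Blank or missing value: fall back to the default base URL.
        return DEFAULT_API_BASE

    parsed = urlsplit(api_base)
    # Strip trailing slashes from the path only, never from query or fragment.
    path = parsed.path.rstrip("/")
    segments = [segment for segment in path.split("/") if segment]

    if not any(_VERSION_SEGMENT.match(segment) for segment in segments):
        path = f"{path}/v1"

    return urlunsplit(parsed._replace(path=path))
```

Under these assumptions, `https://example.com/openai/` becomes `https://example.com/openai/v1`, while `https://example.com/v1/embeddings` is returned unchanged.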

gemini-code-assist · 2026-03-24T17:22:51Z

Summary of Changes

This pull request resolves an issue where the OpenAI embedding client would fail to make requests to OpenAI-compatible endpoints if the configured embedding_api_base URL did not explicitly include the /v1 API version suffix. The changes introduce a robust URL normalization step that automatically appends /v1 when necessary, ensuring consistent and successful API calls.

Highlights

API Base Normalization: Implemented a mechanism to normalize the embedding_api_base for the OpenAI embedding provider.
Automatic /v1 Appending: Ensured that /v1 is automatically appended to the embedding_api_base if it's missing, preventing request failures.
Comprehensive Testing: Added regression tests to verify the correct behavior for both cases: when /v1 is missing and when it's already present.

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The _normalize_embedding_api_base helper assumes that any base URL not ending with /v1 should have /v1 appended, which will produce incorrect URLs for already-versioned or path-specific bases like .../v1-beta or .../v1/embeddings; consider narrowing the normalization condition (e.g., only when the path does not already start with /v or when the host matches OpenAI) or documenting that only root-style bases are supported.
When embedding_api_base is configured as whitespace only, strip() leaves an empty string that bypasses normalization and is passed through to AsyncOpenAI; consider treating an empty string the same as a missing value and falling back to the default base URL instead.
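
To make the first point concrete, here is a hypothetical reconstruction of the "strip trailing slashes, then append `/v1` unless the string already ends with it" approach the review describes. The name `naive_normalize` is invented for this sketch and the body is inferred from the review text, not copied from the PR.

```python
def naive_normalize(api_base: str) -> str:
    """Append /v1 whenever the base does not literally end with /v1."""
    base = api_base.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base


# Root-style bases work as intended.
root_style = naive_normalize("https://example.com/openai/")

# Already-versioned or path-specific bases get a second, unwanted suffix.
versioned = naive_normalize("https://example.com/v1-beta")
path_specific = naive_normalize("https://example.com/v1/embeddings")
```

Here `versioned` ends up as `https://example.com/v1-beta/v1` and `path_specific` as `https://example.com/v1/embeddings/v1`, which is the failure mode the comment warns about.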

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `_normalize_embedding_api_base` helper assumes that any base URL not ending with `/v1` should have `/v1` appended, which will produce incorrect URLs for already-versioned or path-specific bases like `.../v1-beta` or `.../v1/embeddings`; consider narrowing the normalization condition (e.g., only when the path does not already start with `/v` or when the host matches OpenAI) or documenting that only root-style bases are supported.
- When `embedding_api_base` is configured as whitespace only, `strip()` leaves an empty string that bypasses normalization and is passed through to `AsyncOpenAI`; consider treating an empty string the same as a missing value and falling back to the default base URL instead.

## Individual Comments

### Comment 1
<location path="tests/test_openai_source.py" line_range="69-71" />
<code_context>
+    )
+
+
 @pytest.mark.asyncio
 async def test_handle_api_error_content_moderated_removes_images():
     provider = _make_provider(
</code_context>
<issue_to_address>
**suggestion (testing):** Cover edge case where `embedding_api_base` ends with a trailing slash but no `/v1`

Given the current normalization (`rstrip('/')` then conditionally appending `/v1`), please add a test for `embedding_api_base="https://example.com/openai/"` to assert it becomes `https://example.com/openai/v1/` and not `https://example.com/openai//v1/`, so the trailing-slash handling is locked in.

Suggested implementation:

```python
    finally:
        await provider.terminate()


def test_embedding_api_base_trailing_slash_normalized():
    provider = _make_provider(
        overrides={"embedding_api_base": "https://example.com/openai/"}
    )

    # The provider should normalize the embedding API base by removing any
    # trailing slash and then appending `/v1`, resulting in a single slash.
    # This asserts we do *not* end up with `https://example.com/openai//v1/`.
    base_url = str(provider.client._client.base_url)
    assert base_url == "https://example.com/openai/v1/"

```

Depending on how `OpenAIEmbeddingProvider` exposes its underlying OpenAI client, you may need to adjust the attribute chain used to read the base URL:

- If the provider exposes the client as `provider._client` instead of `provider.client`, change `provider.client._client.base_url` to `provider._client.base_url` or `provider._client._client.base_url`.
- If the base URL is stored on a different attribute (e.g. `provider._client.base_url` or `provider.client.base_url`), update the test accordingly while keeping the assertion value `https://example.com/openai/v1/`.

The key behavior to lock in is that `"https://example.com/openai/"` normalizes to `"https://example.com/openai/v1/"` without a double slash.
</issue_to_address>

sourcery-ai · 2026-03-24T17:24:20Z

tests/test_openai_source.py

 @pytest.mark.asyncio
 async def test_handle_api_error_content_moderated_removes_images():
    provider = _make_provider(


suggestion (testing): Cover edge case where embedding_api_base ends with a trailing slash but no /v1

Given the current normalization (rstrip('/') then conditionally appending /v1), please add a test for embedding_api_base="https://example.com/openai/" to assert it becomes https://example.com/openai/v1/ and not https://example.com/openai//v1/, so the trailing-slash handling is locked in.

Suggested implementation:

```python
    finally:
        await provider.terminate()


def test_embedding_api_base_trailing_slash_normalized():
    provider = _make_provider(
        overrides={"embedding_api_base": "https://example.com/openai/"}
    )

    # The provider should normalize the embedding API base by removing any
    # trailing slash and then appending `/v1`, resulting in a single slash.
    # This asserts we do *not* end up with `https://example.com/openai//v1/`.
    base_url = str(provider.client._client.base_url)
    assert base_url == "https://example.com/openai/v1/"
```

Depending on how OpenAIEmbeddingProvider exposes its underlying OpenAI client, you may need to adjust the attribute chain used to read the base URL:

If the provider exposes the client as provider._client instead of provider.client, change provider.client._client.base_url to provider._client.base_url or provider._client._client.base_url.

If the base URL is stored on a different attribute (e.g. provider._client.base_url or provider.client.base_url), update the test accordingly while keeping the assertion value https://example.com/openai/v1/.

The key behavior to lock in is that "https://example.com/openai/" normalizes to "https://example.com/openai/v1/" without a double slash.

gemini-code-assist

Code Review

This pull request introduces a new static method, _normalize_embedding_api_base, within the OpenAIEmbeddingProvider to ensure that API base URLs consistently end with /v1. This normalization is applied during the initialization of the AsyncOpenAI client. New unit tests have been added to validate this behavior. A review comment points out a potential issue where an empty or whitespace-only embedding_api_base configuration could result in an empty api_base being passed to the client, leading to an InvalidURL error. It suggests a more robust approach by falling back to the default URL before normalization in such cases.

gemini-code-assist · 2026-03-24T17:28:01Z

astrbot/core/provider/sources/openai_embedding_source.py

+        if api_base:
+            api_base = self._normalize_embedding_api_base(api_base)


While the current logic correctly normalizes a non-empty api_base, it doesn't handle cases where api_base becomes an empty string (e.g., if the configuration provides an empty or whitespace-only value). This will cause the AsyncOpenAI client to fail with an InvalidURL error. To make this more robust, we should ensure we fall back to the default URL if api_base is empty before normalizing.

api_base = self._normalize_embedding_api_base(api_base or "https://api.openai.com/v1")
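
The fallback can be illustrated in isolation. In this sketch the provider configuration is assumed to be a plain dict, and `resolve_api_base` and `DEFAULT_API_BASE` are hypothetical names; only the `embedding_api_base` key and the default URL come from the thread.

```python
DEFAULT_API_BASE = "https://api.openai.com/v1"


def resolve_api_base(provider_config: dict) -> str:
    """Return the configured base URL, or the default when it is missing or blank."""
    raw = provider_config.get("embedding_api_base") or ""
    # A whitespace-only string is truthy before strip() and empty after it,
    # so the fallback has to be applied to the stripped value.
    return raw.strip() or DEFAULT_API_BASE
```

The resolved value would then be passed to the normalization helper, so the client never receives an empty base URL.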

idiotsj · 2026-03-24T17:30:40Z

@sourcery-ai review

sourcery-ai

Hey - I've found 1 issue

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/sources/openai_embedding_source.py" line_range="29" />
<code_context>
+        ``https://example.com`` or ``https://example.com/openai``. More specific
+        paths (for example ``/v1-beta`` or ``/v1/embeddings``) are preserved as-is.
+        """
+        normalized_api_base = api_base.rstrip("/")
+        parsed = urlsplit(normalized_api_base)
+        path_segments = [segment for segment in parsed.path.split("/") if segment]
</code_context>
<issue_to_address>
**issue (bug_risk):** Avoid applying `rstrip('/')` to the whole URL string to prevent altering query/fragment parts.

Because `api_base` may include query or fragment components (e.g. `https://example.com?next=/foo/`), calling `rstrip('/')` on the full string will incorrectly change those parts (`next=/foo/` → `next=/foo`). Instead, parse the URL (`parsed = urlsplit(api_base)`) and apply `rstrip('/')` only to `parsed.path`, then rebuild the URL with `parsed._replace(path=...)` so query and fragment remain intact.
</issue_to_address>
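
The difference between the two approaches can be shown with a short sketch. The helper name `strip_trailing_slash_from_path` and the sample URL are made up for illustration; `urlsplit`, `urlunsplit`, and `_replace` are standard-library calls.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_slash_from_path(url: str) -> str:
    """Remove trailing slashes from the path component only."""
    parsed = urlsplit(url)
    return urlunsplit(parsed._replace(path=parsed.path.rstrip("/")))


url = "https://example.com/openai/?next=/foo/"

# Stripping the whole string only touches the end, which here is the query.
naive = url.rstrip("/")

# Stripping parsed.path leaves the query value untouched.
safe = strip_trailing_slash_from_path(url)
```

With this input, `naive` is `https://example.com/openai/?next=/foo` (the query value changed and the path slash remains), whereas `safe` is `https://example.com/openai?next=/foo/`.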

idiotsj · 2026-03-24T17:39:34Z

@sourcery-ai review

sourcery-ai

Hey - I've reviewed your changes and they look great!

fix: append /v1 for openai embedding api base

3ccc4cb

auto-assign bot requested review from LIghtJUNction and Raven95676 March 24, 2026 17:22

dosubot bot added the size:M label (this PR changes 30-99 lines, ignoring generated files) Mar 24, 2026

dosubot bot added the area:provider label (the bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) Mar 24, 2026

sourcery-ai bot reviewed Mar 24, 2026

gemini-code-assist bot reviewed Mar 24, 2026

fix: refine embedding api base normalization

df64f4a

dosubot bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) and removed the size:M label (this PR changes 30-99 lines, ignoring generated files) Mar 24, 2026

sourcery-ai bot reviewed Mar 24, 2026

fix: preserve query and fragment in embedding api base

8319921

sourcery-ai bot reviewed Mar 24, 2026

idiotsj wants to merge 3 commits into AstrBotDevs:master from idiotsj:fix/issue-6855-embedding-v1
