feat(minimax): add MiniMax provider with tier-aware rate limiting #84
Societus wants to merge 6 commits into repowise-dev:main from
Conversation
- Add litellm to interactive provider selection menu
- Support LITELLM_BASE_URL for local proxy deployments (no API key required)
- Auto-add openai/ prefix when using api_base for proper LiteLLM routing
- Add dummy API key for local proxies (OpenAI SDK requirement)
- Add validation and tests for litellm provider configuration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… false positives

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add first-class support for Z.AI with OpenAI-compatible API.

- New ZAIProvider with thinking disabled by default for GLM-5 family
- Plan selection: 'coding' (subscription) or 'general' (pay-as-you-go)
- Environment variables: ZAI_API_KEY, ZAI_PLAN, ZAI_BASE_URL, ZAI_THINKING
- Rate limit defaults and auto-detection in CLI helpers

Closes repowise-dev#68
Add RATE_LIMIT_TIERS class attribute and resolve_rate_limiter() static method to BaseProvider.

Any provider with subscription tiers can define RATE_LIMIT_TIERS and pass tier + tiers to resolve_rate_limiter() to get automatic tier-aware rate limiter creation. Precedence: tier > explicit rate_limiter > None. Tier matching is case-insensitive. Invalid tiers raise ValueError.

This is a provider-agnostic foundation -- no provider-specific code. Providers adopt it by defining RATE_LIMIT_TIERS and calling resolve_rate_limiter() in their constructor.

Ref: repowise-dev#68
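The precedence and validation rules described in that commit can be sketched as follows. This is a minimal illustration, not the project's actual code: the shape of `RateLimiter` here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RateLimiter:
    """Stand-in for the project's sliding-window limiter (assumed shape)."""
    rpm: int
    tpm: int


class BaseProvider:
    # Providers with subscription tiers override this, e.g.
    # {"starter": RateLimiter(rpm=5, tpm=25_000), ...}
    RATE_LIMIT_TIERS: dict = {}

    @staticmethod
    def resolve_rate_limiter(tier, tiers, rate_limiter=None):
        # Precedence: tier > explicit rate_limiter > None.
        if tier is not None:
            key = tier.lower()  # tier matching is case-insensitive
            if key not in tiers:
                # Invalid tiers raise ValueError, per the commit message.
                raise ValueError(
                    f"Unknown tier {tier!r}; expected one of {sorted(tiers)}"
                )
            return tiers[key]
        return rate_limiter
```

A provider would then call something like `self.rate_limiter = self.resolve_rate_limiter(tier, self.RATE_LIMIT_TIERS, rate_limiter)` in its constructor.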
Add MiniMax as a built-in provider using the generic tier framework (repowise-dev#82). MiniMax is an OpenAI-compatible API provider with the M2.x model family (M2.7, M2.5, M2.1, M2) and published token plan rate tiers.

Changes:
- New MiniMaxProvider with RATE_LIMIT_TIERS (starter/plus/max/ultra) derived from published 5-hour rolling window limits
- Uses resolve_rate_limiter() from BaseProvider for tier resolution
- reasoning_split=True by default to separate thinking from content
- Bumped retry budget: 5 retries / 30s max for load-shedding tolerance
- Registered in provider registry with openai package dependency hint
- Conservative PROVIDER_DEFAULTS (Starter-tier: 5 RPM / 25K TPM)
- CLI env vars: MINIMAX_API_KEY, MINIMAX_BASE_URL, MINIMAX_REASONING_SPLIT, MINIMAX_TIER
- 30 unit tests (constructor, tiers, generate, stream_chat, registry)

Rate limit tiers (from https://platform.minimax.io/docs/token-plan/intro):
- Starter: 1,500 req/5hrs -> 5 RPM / 25K TPM
- Plus: 4,500 req/5hrs -> 15 RPM / 75K TPM
- Max: 15,000 req/5hrs -> 50 RPM / 250K TPM
- Ultra: 30,000 req/5hrs -> 100 RPM / 500K TPM

Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan -- the difference is faster inference, not quota.

This provider is structurally identical to Z.AI (repowise-dev#83) and was trivial to implement because both use the generic tier framework. The framework eliminated all per-provider boilerplate for tier resolution.

Depends on: repowise-dev#82 (generic tier framework)
Ref: repowise-dev#68
swati510
left a comment
zai and minimax are missing from providers/llm/__init__.py: the registry.py docstring got updated, but __init__.py didn't.
| console.print(f" [{WARN}]Skipped. Please select another provider.[/]") | ||
| return interactive_provider_select(console, model_flag, repo_path=repo_path) | ||
| # Special case: litellm local proxy doesn't need an API key | ||
| if chosen == "litellm" and os.environ.get("LITELLM_BASE_URL"): |
this branch is unreachable — _detect_provider_status (L417-420) already marks litellm as detected when LITELLM_BASE_URL is set, so we never enter the outer if chosen not in detected with this combo.
| @@ -268,18 +268,22 @@ def print_phase_header( | |||
| "litellm": "groq/llama-3.1-70b-versatile", | |||
| } | |||
zai and minimax are wired in helpers.py, validate_provider_config, and the registry, but not here, so they won't show up in the interactive init menu. Please add them to _PROVIDER_DEFAULTS, _PROVIDER_ENV, and _PROVIDER_SIGNUP.
| """ | ||
|
|
||
| def __init__( | ||
| self, |
Since this PR introduces the tier framework on BaseProvider, should zai adopt it too? Its lite/pro/max plans have published limits. OK to defer, but it feels odd to land the framework and only wire up minimax.
swati510
left a comment
Looks like this is stacked on #83, so the base.py/registry/zai changes are shared. Assuming #83 lands first this is fine, just calling it out.
Three things:

- My earlier note about _PROVIDER_DEFAULTS / _PROVIDER_ENV / _PROVIDER_SIGNUP in cli/ui.py still stands: zai and minimax are invisible in the interactive init menu. Worth fixing here since this PR ships both.
- MiniMax rate limits are published as 1,500 requests / 5 hours, but our RateLimiter is a 60-second sliding window. Converting to ~5 RPM is a reasonable steady-state approximation, but a user who bursts will see spurious 429s locally, and one who paces slowly can technically exceed quota without tripping our limiter. Fine to ship as-is, but leave a comment acknowledging the window mismatch so nobody chases a ghost bug later.
- MINIMAX_REASONING_SPLIT is parsed as `.lower() == "true"` in two different branches of helpers.py. Extract a tiny _env_bool helper and accept the usual truthy values (1/yes/on), since that's what users reach for.
| if os.environ.get("MINIMAX_BASE_URL"): | ||
| kwargs["base_url"] = os.environ["MINIMAX_BASE_URL"] | ||
| if os.environ.get("MINIMAX_REASONING_SPLIT"): | ||
| kwargs["reasoning_split"] = os.environ["MINIMAX_REASONING_SPLIT"].lower() == "true" |
Same .lower() == "true" parsing is copy-pasted at line 357 in the auto-detect path. Extract into _env_bool(name, default=False) and reuse. Also accept 1/yes/on, that's what users will type.
…ta warning, clean dead branch

Apologies for the oversight -- these provider dict entries were mostly in place during development but got lost assembling the PR stack.

- Add zai and minimax to _PROVIDER_DEFAULTS, _PROVIDER_ENV, and _PROVIDER_SIGNUP so they appear in interactive init
- Extract _env_bool(name, default=False) helper accepting 1/yes/on/true and reuse for MINIMAX_REASONING_SPLIT parsing in both code paths
- Add session_request_warn to RateLimitConfig: logs a warning when cumulative session requests exceed a threshold, giving users advance notice before hitting long-window provider quotas (e.g. MiniMax's 1500 req/5hr)
- Remove unreachable litellm local-proxy branch (L488): _detect_provider_status already marks litellm as detected when LITELLM_BASE_URL is set, so the guard at L483 makes it unreachable
- Add note about MiniMax 1500req/5hr vs our 60s window approximation

Addresses review feedback from @swati510 on repowise-dev#84.
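The session-count warning described in this commit might look roughly like the sketch below. Only `session_request_warn` comes from the commit message; the class and method names here are illustrative assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class SessionQuotaTracker:
    """Illustrative sketch: count every request in the session and log one
    warning when a configured threshold is crossed, so users get advance
    notice before hitting a long-window provider quota."""

    def __init__(self, session_request_warn=None):
        self.session_request_warn = session_request_warn  # None disables it
        self.session_requests = 0
        self._warned = False

    def record_request(self):
        self.session_requests += 1
        if (
            self.session_request_warn is not None
            and not self._warned
            and self.session_requests >= self.session_request_warn
        ):
            self._warned = True  # warn once, not on every subsequent request
            logger.warning(
                "Session has made %d requests; you may be approaching a "
                "long-window provider quota (e.g. MiniMax's 1500 req/5hr).",
                self.session_requests,
            )
```

Warning once rather than repeatedly keeps the notice from drowning out normal logs on long sessions.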
Summary
Add MiniMax as a built-in LLM provider using the generic tier framework from #82.
This PR is a straightforward application of the same pattern as #83. Both MiniMax and Z.AI are OpenAI-compatible APIs with subscription tiers and built-in reasoning models. The generic tier framework made this provider almost mechanical to implement -- the only provider-specific code is the model names, the `reasoning_split` parameter vs Z.AI's `thinking` toggle, and the tier definitions.

Depends on: #82 (generic tier framework -- merge that first)
Why This Was Straightforward
MiniMax shares the same architectural profile as Z.AI:
- OpenAI-compatible API at https://api.minimax.io/v1
- Uses the `openai` SDK

The generic framework from #82 eliminated all boilerplate for tier resolution. Adding MiniMax was just: define `RATE_LIMIT_TIERS`, set the base URL, and pick the reasoning parameter name. Everything else is inherited.

Changes
New: MiniMax Provider (minimax.py)
- `RATE_LIMIT_TIERS` with Starter/Plus/Max/Ultra configs from published limits
- `resolve_rate_limiter()` from BaseProvider (zero custom tier code)
- `reasoning_split=True` by default (separates thinking from content)

Registry (registry.py)
- `minimax` -> `MiniMaxProvider` with `openai` package hint

Rate Limiter (rate_limiter.py)
- `PROVIDER_DEFAULTS["minimax"]` = Starter-tier conservative (5 RPM / 25K TPM)

CLI Helpers (helpers.py)
- `MINIMAX_API_KEY`, `MINIMAX_BASE_URL`, `MINIMAX_REASONING_SPLIT`, `MINIMAX_TIER` env vars
- Provider auto-detection via `MINIMAX_API_KEY`

Tests (test_minimax_provider.py)
- 30 unit tests (constructor, tiers, generate, stream_chat, registry)

Rate Limit Tiers

From published MiniMax docs (5-hour rolling window):

| Tier | Published limit | Approx. RPM | Approx. TPM |
| --- | --- | --- | --- |
| Starter | 1,500 req / 5 hrs | 5 | 25K |
| Plus | 4,500 req / 5 hrs | 15 | 75K |
| Max | 15,000 req / 5 hrs | 50 | 250K |
| Ultra | 30,000 req / 5 hrs | 100 | 500K |
Highspeed variants (e.g., MiniMax-M2.7-highspeed) share the same rate limits as their base plan. The difference is model selection (faster inference), not quota.
Ref: https://platform.minimax.io/docs/token-plan/intro
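The RPM figures are straight steady-state division of the published 5-hour request quotas (5 hours = 300 minutes); the TPM defaults are chosen separately in the PR, not derived. A quick check of the arithmetic:

```python
# Steady-state RPM from a published N-requests-per-5-hours quota:
# divide by the window length in minutes.
FIVE_HOURS_MIN = 5 * 60  # 300

published = {"starter": 1_500, "plus": 4_500, "max": 15_000, "ultra": 30_000}
rpm = {tier: req // FIVE_HOURS_MIN for tier, req in published.items()}
print(rpm)  # {'starter': 5, 'plus': 15, 'max': 50, 'ultra': 100}
```

As the review thread notes, this is only a steady-state approximation of a 5-hour rolling window, not an equivalent limit.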
Configuration
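A hypothetical configuration sketch using the env vars this PR wires up; the values shown are placeholders, not defaults verified here:

```shell
# All names come from this PR; values below are illustrative only.
export MINIMAX_API_KEY="sk-..."                        # required
export MINIMAX_TIER="starter"                          # starter | plus | max | ultra (case-insensitive)
export MINIMAX_BASE_URL="https://api.minimax.io/v1"    # optional override
export MINIMAX_REASONING_SPLIT="true"                  # truthy: 1/true/yes/on after the _env_bool change
```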
Test Plan
uv run pytest tests/unit/test_providers/test_minimax_provider.py -v  # 30 passed

All 121 provider tests pass with zero regressions.
PR Stack
Related