Background
MiniMax TTS API currently supports output formats: `mp3`, `wav`, `flac`, `pcm`. However, `opus` (and `ogg`) is missing, which creates friction when integrating with messaging platforms.
Use Case
Many messaging platforms natively support voice messages using the opus codec:
- Feishu/Lark: Sends `.opus`/`.ogg` files as native audio voice bubbles (`msg_type: audio`), but treats `.mp3` as file attachments that require download before playback.
- Telegram: Opus is the native format for voice messages, enabling waveform preview and instant playback.
- WhatsApp, Discord: Also prefer opus for voice messages.
Without opus support at the TTS output level, developers need an additional ffmpeg conversion step (`mp3 → opus`), which:
- Adds latency (~2-5s per conversion)
- Requires an external dependency (ffmpeg)
- Complicates serverless / lightweight deployment scenarios
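For reference, the conversion step described above might look like the following minimal sketch. It assumes an `ffmpeg` binary built with `libopus` is available on `PATH`; the helper names and the `32k` bitrate are illustrative choices, not values taken from MiniMax or any messaging platform.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path, bitrate: str = "32k") -> list[str]:
    """Return the ffmpeg argument list that re-encodes src to Opus."""
    return [
        "ffmpeg",
        "-y",               # overwrite dst if it already exists
        "-i", str(src),     # input file (the mp3 returned by the TTS API)
        "-c:a", "libopus",  # encode the audio stream with the Opus codec
        "-b:a", bitrate,    # target bitrate (assumed value, tune per platform)
        str(dst),
    ]


def mp3_to_opus(src: Path, dst: Path) -> Path:
    """Convert src to Opus by shelling out to ffmpeg.

    Raises RuntimeError if ffmpeg is not installed, and
    subprocess.CalledProcessError if the conversion fails.
    """
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Every TTS response has to pass through this extra process spawn, which is the latency and dependency cost listed above.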
Proposal
Add `opus` as a supported value for the `audio_setting.format` parameter in both HTTP and WebSocket TTS APIs.
Preferred implementation:
- Direct opus output from the TTS pipeline (no post-conversion)
- Support in both sync (HTTP/WebSocket) and async TTS endpoints
Minimum viable:
- Even container-level mp3→opus conversion on the server side would be helpful if it saves developers the external dependency
API References
- HTTP: `POST /v1/t2a_v2` (`audio_setting.format` currently accepts: mp3, wav, flac, pcm)
- WebSocket: `wss://api.minimaxi.com/v1/t2a_v2` (same `audio_setting.format` parameter)
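As a rough illustration of the requested change from the client side, the sketch below builds a request body with `audio_setting.format` set to the proposed value. Only `audio_setting.format` and the four current formats come from this issue; the `text` field name, the helper function, and the `allow_proposed` switch are assumptions for illustration, and other fields the real API may require are omitted.

```python
import json

# Formats this issue lists as currently accepted by audio_setting.format.
CURRENT_FORMATS = {"mp3", "wav", "flac", "pcm"}
# The value this issue proposes adding; not accepted by the API today.
PROPOSED_FORMATS = {"opus"}


def build_tts_body(text: str, audio_format: str, allow_proposed: bool = False) -> str:
    """Serialize a request body for POST /v1/t2a_v2 (sketch only)."""
    allowed = CURRENT_FORMATS | (PROPOSED_FORMATS if allow_proposed else set())
    if audio_format not in allowed:
        raise ValueError(f"unsupported audio_setting.format: {audio_format!r}")
    body = {
        "text": text,  # assumed field name for the input text
        "audio_setting": {"format": audio_format},
    }
    return json.dumps(body)
```

With the proposal implemented, `build_tts_body("hello", "opus", allow_proposed=True)` would describe a request whose audio can be forwarded to a messaging platform without a local conversion step.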
Additional Context
This would significantly improve the developer experience for anyone building chatbots, AI assistants, or voice-enabled agents that need to deliver TTS results as native voice messages on modern messaging platforms.