fix: convert qq_official voice messages to wav #6944

Open
1zzxy1 wants to merge 2 commits into AstrBotDevs:master from 1zzxy1:fix/qq-official-voice-wav-conversion

@1zzxy1 commented Mar 25, 2026

Summary

  • detect whether outgoing qq_official voice payloads are already WAV by reading the file header
  • automatically convert non-WAV audio (for example MP3 from TTS plugins) to WAV before Tencent Silk conversion
  • clean up temporary converted WAV files and add regression tests for both conversion and non-conversion paths
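
The flow in the bullets above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `is_wav_file`, `prepare_wav`, `cleanup_converted`, and `process_voice` are hypothetical names, and the `convert_to_wav` and `encode` callables are injected so the sketch assumes nothing about which audio backend or Silk encoder the project actually uses.

```python
import os
import tempfile
from typing import Callable, Optional, Tuple


def is_wav_file(path: str) -> bool:
    """Return True if the file starts with a RIFF/WAVE container header."""
    try:
        with open(path, "rb") as f:
            header = f.read(12)
    except OSError:
        return False
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


def prepare_wav(
    path: str, convert_to_wav: Callable[[str, str], None]
) -> Tuple[str, Optional[str]]:
    """Return (wav_path, temp_path); temp_path is None when no conversion ran."""
    if is_wav_file(path):
        return path, None  # passthrough: already WAV, nothing to clean up later
    fd, temp_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        convert_to_wav(path, temp_path)  # hypothetical converter: (src, dst)
    except Exception:
        os.remove(temp_path)  # do not leak the temp file if conversion fails
        raise
    return temp_path, temp_path


def cleanup_converted(temp_path: Optional[str]) -> None:
    """Remove the temporary WAV, ignoring files that are already gone."""
    if temp_path and os.path.exists(temp_path):
        try:
            os.remove(temp_path)
        except OSError:
            pass  # a real implementation would log a warning here


def process_voice(
    path: str,
    convert_to_wav: Callable[[str, str], None],
    encode: Callable[[str], str],
) -> str:
    """Convert if needed, run the encoder, and always clean up the temp file."""
    wav_path, temp_path = prepare_wav(path, convert_to_wav)
    try:
        return encode(wav_path)
    finally:
        cleanup_converted(temp_path)
```

Keeping the cleanup in a `finally` block means the temporary file is removed even when the encoding step raises, while an input that was already WAV is never deleted because no temporary path is recorded for it.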

Testing

  • uv run --group dev pytest tests/test_qqofficial_message_event.py -q
  • uv run --group dev ruff check astrbot/core/platform/sources/qqofficial/qqofficial_message_event.py tests/test_qqofficial_message_event.py

Closes #6509

Summary by Sourcery

Handle QQ Official voice messages that are not already WAV by converting them before Silk encoding and cleaning up temporary files.

Bug Fixes:

  • Ensure non-WAV voice message payloads are converted to WAV before Tencent Silk encoding to prevent failures when sending audio.
  • Avoid unnecessary reconversion of voice messages that are already in WAV format.

Enhancements:

  • Add detection of WAV audio files via header inspection in QQOfficialMessageEvent.
  • Automatically remove temporary WAV files created during voice message conversion.
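
As a small illustration of what header inspection looks at, the sketch below builds a tiny WAV in memory with the standard-library `wave` module and unpacks its first 12 bytes. The helper names `make_wav_bytes` and `looks_like_wav` are hypothetical and are not part of this PR.

```python
import io
import struct
import wave


def make_wav_bytes(sample_rate: int = 8000, n_frames: int = 80) -> bytes:
    """Build a small, silent 16-bit mono WAV entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def looks_like_wav(header: bytes) -> bool:
    """Check the RIFF container tag (bytes 0-3) and WAVE form type (bytes 8-11)."""
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


data = make_wav_bytes()
# Bytes 0-3: "RIFF"; bytes 4-7: little-endian chunk size; bytes 8-11: "WAVE".
chunk_id, chunk_size, form_type = struct.unpack("<4sI4s", data[:12])
```

The check only identifies the container: an MP3 file renamed to `.wav` fails it, which is why the extension alone is not reliable, but a RIFF/WAVE file can still carry a non-PCM codec, so a passing check does not by itself guarantee that a downstream encoder accepts the audio.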

Tests:

  • Add async tests covering both conversion of non-WAV voice messages and the passthrough path for existing WAV inputs in QQOfficialMessageEvent.

@auto-assign auto-assign bot requested review from Soulter and advent259141 March 25, 2026 10:48
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 25, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the QQ Official message event handling by introducing robust audio format management. It ensures that all voice messages are in the required WAV format before being converted to Tencent Silk, preventing potential issues with unsupported audio types and improving the reliability of voice message delivery. The changes include detection, conversion, and proper cleanup of temporary files, all backed by new test cases.

Highlights

  • Audio Format Detection: Implemented logic to detect if outgoing QQ Official voice payloads are already in WAV format by inspecting the file header.
  • Automatic Audio Conversion: Added functionality to automatically convert non-WAV audio files (e.g., MP3 from TTS plugins) to WAV format before they are processed for Tencent Silk conversion.
  • Temporary File Cleanup: Ensured that any temporary WAV files created during the conversion process are properly cleaned up after use.
  • Regression Tests: Introduced new regression tests to cover both the audio conversion path and the scenario where conversion is skipped for already WAV files.

@dosubot dosubot bot added the area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. label Mar 25, 2026

@sourcery-ai sourcery-ai bot left a comment

Hey - I've reviewed your changes and they look great!

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the QQOfficialMessageEvent to support non-WAV audio records by converting them to WAV format before processing them into Tencent Silk. It introduces a new utility method _is_wav_audio_file and includes tests for both WAV and non-WAV audio handling. The review comments suggest converting the newly added file I/O operations (os.path.exists, os.remove, and open()) to their asynchronous aiofiles.os counterparts to prevent blocking the asyncio event loop.

Comment on lines +638 to +646
    if converted_record_wav_path and os.path.exists(
        converted_record_wav_path
    ):
        try:
            os.remove(converted_record_wav_path)
        except OSError as e:
            logger.warning(
                f"[QQOfficial] failed to remove converted audio file: {e}"
            )

high

This block uses synchronous file operations (os.path.exists and os.remove) which can block the asyncio event loop. Please use the asynchronous versions from aiofiles.os. You will need to import aiofiles.os.

Suggested change
-    if converted_record_wav_path and os.path.exists(
-        converted_record_wav_path
-    ):
-        try:
-            os.remove(converted_record_wav_path)
-        except OSError as e:
-            logger.warning(
-                f"[QQOfficial] failed to remove converted audio file: {e}"
-            )
+    if converted_record_wav_path and await aiofiles.os.path.exists(
+        converted_record_wav_path
+    ):
+        try:
+            await aiofiles.os.remove(converted_record_wav_path)
+        except OSError as e:
+            logger.warning(
+                f"[QQOfficial] failed to remove converted audio file: {e}"
+            )
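
For comparison, roughly the same non-blocking cleanup can be sketched with the standard library alone by moving the blocking call to a worker thread with `asyncio.to_thread` (Python 3.9+). This is an alternative illustration, not the reviewer's suggestion or the PR's code; `_remove_if_exists`, `remove_temp_file`, and the logger name are hypothetical.

```python
import asyncio
import logging
import os
from typing import Optional

logger = logging.getLogger("qqofficial.sketch")  # hypothetical logger name


def _remove_if_exists(path: str) -> bool:
    """Blocking helper: delete the file, reporting whether it existed."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


async def remove_temp_file(path: Optional[str]) -> bool:
    """Delete a temporary file without blocking the event loop."""
    if not path:
        return False
    try:
        # The blocking os.remove call runs in a worker thread.
        return await asyncio.to_thread(_remove_if_exists, path)
    except OSError as e:
        logger.warning("failed to remove converted audio file: %s", e)
        return False
```

Attempting the removal directly and catching `FileNotFoundError` also avoids the separate existence check, so there is no window in which the file can vanish between the check and the delete.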

Comment on lines +675 to +682
    @staticmethod
    def _is_wav_audio_file(file_path: str) -> bool:
        try:
            with open(file_path, "rb") as f:
                header = f.read(12)
        except OSError:
            return False
        return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"

high

The _is_wav_audio_file method performs synchronous file I/O using open(), which can block the asyncio event loop. Since aiofiles is already used in the project, it's better to use it here for non-blocking I/O. This method should be converted to an async method, and the call site updated with await.

Suggested change
-    @staticmethod
-    def _is_wav_audio_file(file_path: str) -> bool:
-        try:
-            with open(file_path, "rb") as f:
-                header = f.read(12)
-        except OSError:
-            return False
-        return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
+    @staticmethod
+    async def _is_wav_audio_file(file_path: str) -> bool:
+        try:
+            async with aiofiles.open(file_path, "rb") as f:
+                header = await f.read(12)
+        except OSError:
+            return False
+        return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
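
The note about updating the call site matters more than it may appear: once the check is a coroutine function, calling it without `await` yields a coroutine object, which is always truthy, so every file would silently be treated as WAV. The sketch below shows this with a standard-library stand-in that uses `asyncio.to_thread` in place of aiofiles; `is_wav_audio_file`, `_read_header`, and `needs_conversion` are hypothetical names, not the PR's code.

```python
import asyncio


def _read_header(path: str, size: int = 12) -> bytes:
    """Blocking helper: read the first bytes of the file."""
    with open(path, "rb") as f:
        return f.read(size)


async def is_wav_audio_file(path: str) -> bool:
    """Async header check; the blocking read runs in a worker thread."""
    try:
        header = await asyncio.to_thread(_read_header, path)
    except OSError:
        return False
    return len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"


async def needs_conversion(path: str) -> bool:
    # The await is required here. Without it, `is_wav_audio_file(path)` is a
    # coroutine object, which is truthy, so `not ...` would always be False
    # and non-WAV input would skip conversion.
    return not await is_wav_audio_file(path)
```

As in the synchronous version, an unreadable or missing file returns False from the check, so it is routed to the conversion path, where the converter is the component that reports the error.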

Labels

area:platform: The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on.
size:L: This PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug] QQ official bot cannot send voice messages

1 participant