A smart, voice-enabled Telegram bot powered by the Gemini 2.5 Flash model. This bot understands your voice messages, processes them with Gemini, and replies in text, adding an audio reply when the response is long!
- Voice-to-Text & Text-to-Voice: Speak directly to the bot. It understands audio inputs and talks back to you!
- Powered by Gemini 2.5 Flash: Super fast, contextual, and intelligent responses.
- Smart Voice Trigger: To save resources, the bot replies with audio only when the response is long.
- Ultra-Lightweight: Strictly optimized to run efficiently on ultra-low-spec hardware.
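
The Smart Voice Trigger can be pictured as a simple length check on the generated reply. The sketch below is illustrative only: the function name `should_reply_with_audio` and the 300-character cutoff are assumptions, since this README does not state the actual threshold or how the bot measures "long".

```python
# Hypothetical cutoff: the real threshold used by the bot is not documented here.
VOICE_REPLY_MIN_CHARS = 300


def should_reply_with_audio(reply_text: str, min_chars: int = VOICE_REPLY_MIN_CHARS) -> bool:
    """Return True when the reply is long enough to justify synthesizing audio."""
    # Surrounding whitespace is ignored so padding does not trigger audio.
    return len(reply_text.strip()) >= min_chars
```

Short answers skip speech synthesis entirely under this scheme, which is where the resource saving would come from.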
The biggest challenge of this project was the memory constraint: the bot is hosted on a minimal server with only 512 MB of RAM and 1 vCPU.
To make it stable, I optimized the code by:
- Implementing asynchronous execution (`asyncio`) to prevent blocking the single-core CPU.
- Streamlining audio file processing to prevent memory leaks.
- Keeping dependencies minimal to maintain a low RAM footprint.
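
The first two points can be sketched with the standard library alone. This is a minimal illustration, not the bot's actual code: `process_voice` and `transcode_blocking` are hypothetical names, the "transcoding" step is a stand-in that only reads the file size, and `asyncio.to_thread` assumes Python 3.9 or newer.

```python
import asyncio
import os
import tempfile


def transcode_blocking(src_path: str) -> int:
    """Stand-in for blocking audio work; here it only returns the file size."""
    return os.path.getsize(src_path)


async def process_voice(audio_bytes: bytes) -> int:
    """Handle one voice message without blocking the event loop."""
    # Write the upload to a temporary file for the blocking step to read.
    fd, path = tempfile.mkstemp(suffix=".ogg")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        # Run the blocking step in a worker thread so the event loop
        # can keep serving other chats in the meantime.
        return await asyncio.to_thread(transcode_blocking, path)
    finally:
        # Always delete the temporary file, even if processing failed.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    print(asyncio.run(process_voice(b"fake-ogg-data")))
```

The `try`/`finally` cleanup is the part aimed at leaks: temporary audio files are removed on every code path, so failed requests do not accumulate on disk.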