WhatsApp Audio Transcriber Bot
Overview
Automatically transcribe WhatsApp audio messages to text using AI-powered speech recognition. This workflow receives audio messages via webhook, sends them to the Whisper model hosted on the Groq API, and replies with the transcribed text in the same conversation.
Use Cases
- Accessibility: Help users with hearing impairments access audio content
- Workplace Communication: Quickly scan audio messages in professional settings
- Language Learning: Get text versions of audio for better comprehension
- Meeting Notes: Convert voice messages to searchable text format
- Multilingual Support: Transcribe audio in Portuguese (configurable for other languages)
How it Works
- Message Reception: A webhook receives WhatsApp messages in real time
- Audio Detection: A Switch node passes only audio messages and drops everything else
- Format Conversion: The base64 audio is converted to an MP3 file
- AI Transcription: The audio is sent to the Groq API and transcribed with the Whisper Large V3 model
- Response Delivery: The transcribed text is sent back to the original conversation
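The detection and conversion steps above can be sketched outside n8n as follows. This is a minimal Python illustration, not the workflow's actual node configuration. The payload field names (`data.key.fromMe`, `data.messageType`, `data.message.base64`) are assumptions about the Evolution API webhook body and should be checked against a real payload.

```python
import base64
from typing import Optional


def should_transcribe(payload: dict) -> bool:
    """Return True only for audio messages that the bot did not send itself."""
    data = payload.get("data", {})
    from_me = data.get("key", {}).get("fromMe", False)  # assumed field name
    return data.get("messageType") == "audioMessage" and not from_me


def decode_audio(payload: dict) -> Optional[bytes]:
    """Decode the base64 audio string into raw bytes, or None if it is absent."""
    encoded = payload.get("data", {}).get("message", {}).get("base64")  # assumed path
    if not encoded:
        return None
    return base64.b64decode(encoded)


# Hypothetical payload, trimmed to the fields used above.
sample = {
    "data": {
        "key": {"fromMe": False},
        "messageType": "audioMessage",
        "message": {"base64": base64.b64encode(b"fake-audio").decode("ascii")},
    }
}

if should_transcribe(sample):
    audio_bytes = decode_audio(sample)
    print(len(audio_bytes))
```

Checking `fromMe` before decoding mirrors the self-message filtering listed under Key Features, so the bot never tries to transcribe its own replies.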
Key Features
- ✅ Real-time Processing: Transcribes incoming audio messages as soon as they arrive
- ✅ High Accuracy: Uses the Whisper Large V3 model for reliable transcription
- ✅ Auto-Reply: Automatically responds in the same WhatsApp conversation
- ✅ Message Quoting: References the original audio message in the reply
- ✅ Portuguese Optimized: Configured for Brazilian Portuguese transcription
- ✅ Self-Message Filtering: Ignores messages sent by the bot itself
Prerequisites
Required Services
- Evolution API: WhatsApp integration service
- Groq API: AI transcription service (Whisper model)
- n8n Instance: Workflow automation platform
API Keys & Configuration
- Groq API key (set as environment variable: GROQ_API_KEY)
- Evolution API instance properly configured
- Webhook URL configured in Evolution API
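A small pre-flight check can catch a missing key before the first audio message arrives. The sketch below assumes the key is read from the process environment. The helper name `load_groq_api_key` is illustrative and is not part of n8n or the Groq API.

```python
import os


def load_groq_api_key(env=None) -> str:
    """Return the Groq API key, raising a clear error if it is not set."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real environment; default to os.environ.
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; transcription requests will fail")
    return key
```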
Setup Instructions
- Import Workflow: Import the JSON workflow into your n8n instance
- Configure Environment: Set the GROQ_API_KEY environment variable
- Set Up Webhook: Configure Evolution API to send messages to the webhook endpoint
- Test Connection: Send a test audio message to verify the workflow
Workflow Nodes
- Webhook: Receives WhatsApp messages from Evolution API
- Edit Fields: Extracts relevant data (number, name, message, audio)
- Switch: Filters only audio messages (audioMessage type)
- Convert to File: Transforms base64 audio to MP3 format
- HTTP Request: Sends audio to Groq API for transcription
- Evolution API: Sends transcribed text back to WhatsApp
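To make the Edit Fields step more concrete, here is a rough Python equivalent of the extraction it performs. The source paths (`data.key.remoteJid`, `data.key.id`, `data.pushName`, `data.message.base64`) and the `IncomingAudio` type are assumptions for illustration; the workflow's real field mappings may differ.

```python
from dataclasses import dataclass


@dataclass
class IncomingAudio:
    number: str        # sender's number, without the WhatsApp JID suffix
    name: str          # sender's display name
    message_id: str    # ID of the original message, used to quote it in the reply
    audio_base64: str  # base64-encoded audio content


def extract_fields(payload: dict) -> IncomingAudio:
    """Pull the fields the later nodes need out of the webhook body."""
    data = payload.get("data", {})
    key = data.get("key", {})
    remote_jid = key.get("remoteJid", "")  # assumed form: "<number>@s.whatsapp.net"
    return IncomingAudio(
        number=remote_jid.split("@")[0],
        name=data.get("pushName", ""),
        message_id=key.get("id", ""),
        audio_base64=data.get("message", {}).get("base64", ""),
    )
```

Keeping the message ID alongside the number is what lets the final node quote the original audio message when it replies.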
Configuration Options
Groq API Settings
- Model: whisper-large-v3
- Language: pt (Portuguese)
- Temperature: 0 (most deterministic output)
- Response Format: json
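The settings above map onto the form fields of a transcription request. The following sketch only builds the request pieces and does not send anything. The helper names are hypothetical, and the endpoint URL is Groq's OpenAI-compatible transcription route as understood at the time of writing, so verify it against the current Groq documentation.

```python
# Assumed endpoint; confirm against the Groq API reference before use.
GROQ_TRANSCRIPTION_URL = "https://api.groq.com/openai/v1/audio/transcriptions"


def build_transcription_form(language: str = "pt") -> dict:
    """Form fields sent alongside the audio file in the multipart request."""
    return {
        "model": "whisper-large-v3",
        "language": language,
        "temperature": "0",
        "response_format": "json",
    }


def build_headers(api_key: str) -> dict:
    """Bearer-token authorization header for the request."""
    return {"Authorization": f"Bearer {api_key}"}
```

With an HTTP client such as `requests`, these could be sent as `requests.post(GROQ_TRANSCRIPTION_URL, headers=build_headers(key), data=build_transcription_form(), files={"file": ("audio.mp3", audio_bytes, "audio/mpeg")})`. With the json response format, the transcription is expected in the `text` field of the response body.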
Customization Options
- Change the transcription language by modifying the language parameter
- Adjust temperature to trade deterministic output (0) against more varied output (higher values)
- Change the response format to get a different output structure
Response Format
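Replies follow this template (the first line is Portuguese for "Message transcribed automatically."):
*Mensagem transcrita automaticamente.*
[Transcribed text content]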
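As a sketch, the reply text could be assembled like this. The function name and the single newline between the header and the transcription are assumptions. The asterisks are WhatsApp's own italic markers and are kept verbatim.

```python
REPLY_HEADER = "*Mensagem transcrita automaticamente.*"


def format_reply(transcription: str) -> str:
    """Prefix the transcribed text with the fixed header line."""
    return f"{REPLY_HEADER}\n{transcription.strip()}"
```

To reply in another language, changing `REPLY_HEADER` is the only edit this sketch would need.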
Technical Specifications
- Input: Base64-encoded audio from WhatsApp
- Output: Plain text transcription
- Processing Time: Typically 2-5 seconds per audio message
- Supported Audio: MP3 format (converted from WhatsApp audio)
- Language: Portuguese (configurable)
Troubleshooting
- No Response: Check the Groq API key and the webhook configuration
- Poor Transcription: Check the audio quality and confirm that the language setting matches the spoken language
- Error Messages: Monitor the n8n execution logs for detailed error information
Version History
- v0.0.1: Initial release with basic transcription functionality