Noriwal AlMa Jr

Noriwal AlMa Jr

Transcribe WhatsApp Audio Messages with Whisper AI via Groq

WhatsApp Audio Transcriber Bot

Overview

Automatically transcribe WhatsApp audio messages to text using AI-powered speech recognition. This workflow receives audio messages via webhook, processes them through Groq's Whisper API, and replies with the transcribed text in the same conversation.

Use Cases

  • Accessibility : Help users with hearing impairments access audio content
  • Workplace Communication : Quickly scan audio messages in professional settings
  • Language Learning : Get text versions of audio for better comprehension
  • Meeting Notes : Convert voice messages to searchable text format
  • Multilingual Support : Transcribe audio in Portuguese (configurable for other languages)

How it Works

  1. Message Reception : Webhook receives WhatsApp messages in real-time
  2. Audio Detection : Filters only audio messages using Switch node
  3. Format Conversion : Converts base64 audio to MP3 file format
  4. AI Transcription : Processes audio through Groq API with Whisper Large V3 model
  5. Response Delivery : Sends transcribed text back to the original conversation

Key Features

  • Real-time Processing : Instant transcription of incoming audio messages
  • High Accuracy : Uses Whisper Large V3 model for reliable transcription
  • Auto-Reply : Automatically responds in the same WhatsApp conversation
  • Message Quoting : References the original audio message in the reply
  • Portuguese Optimized : Configured for Brazilian Portuguese transcription
  • Self-Message Filtering : Ignores messages sent by the bot itself

Prerequisites

Required Services

  • Evolution API : WhatsApp integration service
  • Groq API : AI transcription service (Whisper model)
  • n8n Instance : Workflow automation platform

API Keys & Configuration

  • Groq API key (set as environment variable: GROQ_API_KEY )
  • Evolution API instance properly configured
  • Webhook URL configured in Evolution API

Setup Instructions

  1. Import Workflow : Import the JSON workflow into your n8n instance
  2. Configure Environment : Set GROQ_API_KEY environment variable
  3. Setup Webhook : Configure Evolution API to send messages to the webhook endpoint
  4. Test Connection : Send a test audio message to verify the workflow

Workflow Nodes

  • Webhook : Receives WhatsApp messages from Evolution API
  • Edit Fields : Extracts relevant data (number, name, message, audio)
  • Switch : Filters only audio messages ( audioMessage type)
  • Convert to File : Transforms base64 audio to MP3 format
  • HTTP Request : Sends audio to Groq API for transcription
  • Evolution API : Sends transcribed text back to WhatsApp

Configuration Options

Groq API Settings

  • Model : whisper-large-v3
  • Language : pt (Portuguese)
  • Temperature : 0 (maximum accuracy)
  • Response Format : json

Customization Options

  • Change language by modifying the language parameter
  • Adjust temperature for different accuracy/creativity balance
  • Modify response format for different output styles

Response Format

 *Mensagem transcrita automaticamente.*
 [Transcribed text content]

Technical Specifications

  • Input : Base64 encoded audio from WhatsApp
  • Output : Plain text transcription
  • Processing Time : Typically 2-5 seconds per audio message
  • Supported Audio : MP3 format (converted from WhatsApp audio)
  • Language : Portuguese (configurable)

Troubleshooting

  • No Response : Check Groq API key and webhook configuration
  • Poor Transcription : Ensure audio quality and check language settings
  • Error Messages : Monitor n8n execution logs for detailed error information

Version History

  • v0.0.1 : Initial release with basic transcription functionality
Do you want to automate your business?

Let's talk about your project