Voice · Otter.ai

Otter.ai

AI meeting notetaker — live transcription, summaries, and AI chat for meetings.

Freemium · Cloud · Web · iOS · Android

Records and transcribes meetings in real time with speaker identification, then generates summaries and action items. An AI chat answers questions across your meeting history, and the Otter agent can join calls on Zoom, Google Meet, and Teams to take notes automatically.

Where it runs

Web
iOS
Android

Tags

#meetings
#transcription
#notetaker
#speech-to-text

Voice · Freemium
ElevenLabs
ElevenLabs
Frontier TTS, voice cloning, and dubbing. Industry default.
Hosted speech synthesis at near-human quality — TTS, voice cloning, multilingual dubbing, and conversational voice agents. Default choice when you need a voice that sounds like a person, not a robot.
AI insight: Set the bar for voice cloning — a usable clone from seconds of reference audio — which is how it became the default TTS.
- tts
- voice-cloning
- dubbing
- multilingual
Voice · Freemium
AssemblyAI
AssemblyAI
Production speech-to-text + audio intelligence API.
Speech recognition API with batch and real-time streaming transcription, speaker diarization, and language detection. Its Universal models pair with optional Speech Understanding features (summarization, sentiment, redaction) so a single API can build conversation-intelligence products. Starts with a free credit and pay-as-you-go, per-second billing.
AI insight: Bills per second and layers 'Speech Understanding' models — summaries, sentiment, PII redaction — on top of raw transcription.
- stt
- transcription
- streaming
- audio-intelligence
Voice · Freemium
Fathom
Fathom Video
Free AI notetaker for Zoom, Meet, and Teams calls.
Records, transcribes, and summarizes video meetings with AI summaries, action items, and shareable clips. Free for unlimited individual recording; paid Team tiers add advanced AI, analytics, and CRM sync.
AI insight: Offers unlimited recording and transcription free forever, monetizing team analytics and CRM sync where rivals cap free minutes.
- meeting-notes
- transcription
- summaries
- notetaker
Voice · Freemium
Fireflies.ai
Fireflies.ai
AI notetaker that records, transcribes, and summarizes meetings.
A meeting assistant whose bot joins Zoom, Google Meet, Microsoft Teams, and Webex calls to record, transcribe, and summarize them, then extracts action items and highlights. The AskFred assistant lets you query across past conversations, and transcripts sync to CRMs and other tools. Mobile and Chrome apps capture in-person and browser audio.
AI insight: A calendar bot auto-joins Zoom, Meet, and Teams calls to record them, and AskFred answers questions across your past meetings.
- meeting-notes
- transcription
- summaries
- crm
Voice · Freemium
Granola
Granola
AI notepad for back-to-back meetings — captures your Mac's audio, no meeting bot.
A desktop notepad that captures your computer's audio during calls and enhances your rough notes into structured summaries afterward. Because it records the system's own audio rather than dispatching a bot into the meeting, there's no visible notetaker for participants. Includes shared folders, custom templates, and an AI chat over your notes.
AI insight: Granola records your computer's own audio instead of sending a bot into the call, so nothing shows up as a notetaker in the meeting.
- meetings
- notetaker
- transcription
- mac
Voice · Freemium
Hume AI
Hume AI
Empathic Voice Interface — speech-to-speech AI that hears tone.
A voice AI toolkit built around the Empathic Voice Interface (EVI), a speech-to-speech model that infers emotion and prosody from a user's voice and modulates its replies accordingly. Exposed as an API for building expressive voice agents and assistants. From a research lab focused on emotional intelligence in AI.
AI insight: EVI reads the prosody and emotion in your voice — not just the words — and tunes its own tone and timing in response.
- voice
- speech-to-speech
- emotion
- api
Voice · Freemium
Wispr Flow
Wispr
AI voice dictation that types for you across every app, on desktop and mobile.
A dictation tool that turns speech into clean, formatted text in any app — removing filler words and applying context-aware edits as you talk. One subscription works across macOS, Windows, iOS, and Android, syncing your custom vocabulary and snippets between devices.
AI insight: Spans macOS, Windows, iOS, and Android, making it one of the few AI dictation tools on all four, with a custom dictionary that syncs across them.
- dictation
- voice
- speech-to-text
- productivity
Voice · Freemium
Cartesia
Cartesia
Low-latency streaming TTS. Sub-100ms first audio.
Streaming-first speech synthesis built around the Sonic family of state-space models. Aims at real-time agent voices where latency between turns is the product. Strong choice for sub-200ms voice loops.
AI insight: Its Sonic voices run on state-space models rather than transformers — the architectural reason it hits sub-100ms first audio.
- tts
- streaming
- low-latency
- real-time
Voice · Freemium
Deepgram
Deepgram
Production speech-to-text. The STT default for many companies.
End-to-end speech recognition platform — real-time streaming, batch transcription, speaker diarization, and language detection. Strong on accented speech, telephony audio, and long-form recordings.
AI insight: Tuned for messy real-world audio — accents, phone lines, overlapping speakers — where general transcribers tend to fall apart.
- stt
- transcription
- streaming
- diarization
Voice · Freemium
Vapi
Vapi
Voice agent infrastructure. Build a phone agent in a weekend.
Production voice-agent platform — telephony, STT, LLM, TTS, and interrupt handling stitched together so you call an endpoint and get a working phone agent. Pluggable models at every layer.
AI insight: Solves the hard parts of phone agents — telephony and barge-in interrupts — leaving you to pick the STT, LLM, and TTS layers.
- voice-agents
- telephony
- phone
- real-time