Granola

AI notepad for back-to-back meetings: captures your computer's audio, no meeting bot.

FREEMIUM · Cloud · macOS · Windows · iOS

A desktop notepad that captures your computer's audio during calls and enhances your rough notes into structured summaries afterward. Because it records the system's own audio rather than dispatching a bot into the meeting, there's no visible notetaker for participants. Includes shared folders, custom templates, and an AI chat over your notes.

Where it runs

  • macOS
  • Windows
  • iOS

Tags

  • #meetings
  • #notetaker
  • #transcription
  • #mac
Open Granola · Pricing
  • View ElevenLabs details
    Voice · FREEMIUM

    ElevenLabs

    ElevenLabs

    Frontier TTS, voice cloning, and dubbing. Industry default.

    Hosted speech synthesis at near-human quality — TTS, voice cloning, multilingual dubbing, and conversational voice agents. Default choice when you need a voice that sounds like a person, not a robot.

    AI insight: Set the bar for voice cloning — a usable clone from seconds of reference audio — which is how it became the default TTS.

    • tts
    • voice-cloning
    • dubbing
    • multilingual
  • View AssemblyAI details
    Voice · FREEMIUM

    AssemblyAI

    AssemblyAI

    Production speech-to-text + audio intelligence API.

    Speech recognition API with batch and real-time streaming transcription, speaker diarization, and language detection. Its Universal models pair with optional Speech Understanding features (summarization, sentiment, redaction), so you can build conversation-intelligence products on a single API. Pricing starts with a free credit, then moves to pay-as-you-go, per-second billing.

    AI insight: Bills per second and layers 'Speech Understanding' models — summaries, sentiment, PII redaction — on top of raw transcription.

    • stt
    • transcription
    • streaming
    • audio-intelligence
  • View Fathom details
    Voice · FREEMIUM

    Fathom

    Fathom Video

    Free AI notetaker for Zoom, Meet, and Teams calls.

    Records and transcribes video meetings, then generates AI summaries, action items, and shareable clips. Recording is free and unlimited for individuals; paid Team tiers add advanced AI, analytics, and CRM sync.

    AI insight: Offers unlimited recording and transcription free forever, monetizing team analytics and CRM sync where rivals cap free minutes.

    • meeting-notes
    • transcription
    • summaries
    • notetaker
  • View Fireflies.ai details
    Voice · FREEMIUM

    Fireflies.ai

    Fireflies.ai

    AI notetaker that records, transcribes, and summarizes meetings.

    A meeting assistant whose bot joins Zoom, Google Meet, Microsoft Teams, and Webex calls to record, transcribe, and summarize them, then extracts action items and highlights. The AskFred assistant lets you query across past conversations, and transcripts sync to CRMs and other tools. Mobile apps and a Chrome extension capture in-person and browser audio.

    AI insight: A calendar bot auto-joins Zoom, Meet, and Teams calls to record them, and AskFred answers questions across your past meetings.

    • meeting-notes
    • transcription
    • summaries
    • crm
  • View Hume AI details
    Voice · FREEMIUM

    Hume AI

    Hume AI

    Empathic Voice Interface — speech-to-speech AI that hears tone.

    A voice AI toolkit built around the Empathic Voice Interface (EVI), a speech-to-speech model that infers emotion and prosody from a user's voice and modulates its replies accordingly. Exposed as an API for building expressive voice agents and assistants. From a research lab focused on emotional intelligence in AI.

    AI insight: EVI reads the prosody and emotion in your voice — not just the words — and tunes its own tone and timing in response.

    • voice
    • speech-to-speech
    • emotion
    • api
  • View Otter.ai details
    Voice · FREEMIUM

    Otter.ai

    Otter.ai

    AI meeting notetaker — live transcription, summaries, and AI chat for meetings.

    Records and transcribes meetings in real time with speaker identification, then generates summaries and action items. An AI chat answers questions across your meeting history, and the Otter agent can join calls on Zoom, Google Meet, and Teams to take notes automatically.

    AI insight: Otter's free tier allows only 3 lifetime file imports, steering you toward live meeting capture rather than uploading recorded audio.

    • meetings
    • transcription
    • notetaker
    • speech-to-text
  • View Wispr Flow details
    Voice · FREEMIUM

    Wispr Flow

    Wispr

    AI voice dictation that types for you across every app, on desktop and mobile.

    A dictation tool that turns speech into clean, formatted text in any app — removing filler words and applying context-aware edits as you talk. One subscription works across macOS, Windows, iOS, and Android, syncing your custom vocabulary and snippets between devices.

    AI insight: Spans macOS, Windows, iOS, and Android, making it one of the few AI dictation tools on all four, with a custom dictionary that syncs across them.

    • dictation
    • voice
    • speech-to-text
    • productivity
  • View Cartesia details
    Voice · FREEMIUM

    Cartesia

    Cartesia

    Low-latency streaming TTS. Sub-100ms first audio.

    Streaming-first speech synthesis built around the Sonic family of state-space models. Aims at real-time agent voices where latency between turns is the product. Strong choice for sub-200ms voice loops.

    AI insight: Its Sonic voices run on state-space models rather than transformers — the architectural reason it hits sub-100ms first audio.

    • tts
    • streaming
    • low-latency
    • real-time
  • View Deepgram details
    Voice · FREEMIUM

    Deepgram

    Deepgram

    Production speech-to-text. The STT default for many companies.

    End-to-end speech recognition platform — real-time streaming, batch transcription, speaker diarization, and language detection. Strong on accented speech, telephony audio, and long-form recordings.

    AI insight: Tuned for messy real-world audio — accents, phone lines, overlapping speakers — where general transcribers tend to fall apart.

    • stt
    • transcription
    • streaming
    • diarization
  • View Vapi details
    Voice · FREEMIUM

    Vapi

    Vapi

    Voice agent infrastructure. Build a phone agent in a weekend.

    Production voice-agent platform — telephony, STT, LLM, TTS, and interrupt handling stitched together so you call an endpoint and get a working phone agent. Pluggable models at every layer.

    AI insight: Solves the hard parts of phone agents — telephony and barge-in interrupts — leaving you to pick the STT, LLM, and TTS layers.

    • voice-agents
    • telephony
    • phone
    • real-time