Voice AI

Gemini Flash Live Edges GPT-4 Realtime in Voice AI Race

Google's Gemini 3.1 Flash Live beats GPT-4 Realtime 1.5 on Scale AI's Audio MultiChallenge and takes Search Live to 200+ countries - but it doesn't lead every benchmark.

Cohere's Open-Source Transcribe Tops ASR Leaderboard

Cohere releases its first audio model - a 2B-parameter open-source ASR system beating Whisper Large v3 by 27% on the Hugging Face Open ASR Leaderboard.

Voxtral TTS Review: Mistral Takes On ElevenLabs

Mistral's first open-weights TTS model clones voices from 3 seconds of audio, beats ElevenLabs on price, and arrives with real limitations worth knowing.

Mistral Ships Voxtral - Open-Weights Voice AI Platform

Mistral releases Voxtral, a pair of open-weights models covering speech recognition and text-to-speech that undercut OpenAI and ElevenLabs on price.

Tencent's 7B Voice AI Targets OpenAI's Realtime API

Tencent open-sources Covo-Audio, a 7B end-to-end audio language model with native full-duplex conversation that beats larger closed models on key benchmarks.

Best AI Models for Voice and Speech - March 2026

ElevenLabs Scribe v2 leads speech-to-text at 2.3% WER while ElevenLabs Flash v2.5 sets the pace for TTS with 75ms latency - but Google and Mistral are closing in fast.

How to Set Up an AI Voice Agent From Scratch

A practical guide to building an AI voice agent using platforms like Vapi, Retell, and LiveKit - covering architecture, setup steps, and cost estimates.

AI Voice and Speech Leaderboard: TTS and STT Rankings

Rankings of the best text-to-speech and speech-to-text AI models by naturalness, accuracy, latency, and pricing.

PersonaPlex 7B Runs Full-Duplex Speech on a Mac

A developer ported NVIDIA's PersonaPlex 7B speech-to-speech model to native Swift using MLX, running full-duplex conversation on Apple Silicon with no cloud, no Python, and faster-than-real-time inference.

Speech Turing Tests, Smart Routing, Pseudocode Agents

New research reveals no speech AI passes a Turing test, adaptive routing slashes LLM costs by 82%, and pseudocode planning transforms agent reliability.

NotebookLM Review: Google's AI Research Assistant That Accidentally Invented a New Medium

NotebookLM went viral for turning documents into AI podcasts, but the real story is whether Google has built a genuinely useful research tool or just a clever party trick. We spent a month finding out.

Best AI Voice Generators in 2026 - From API-First to Open Source

A data-driven comparison of the top AI voice generators and TTS tools in 2026, covering ElevenLabs, Fish Audio, OpenAI TTS, LMNT, Cartesia, and open-source alternatives.
