Veo 3.1 - Google 4K Video Model, Free for Any Account

Google DeepMind's Veo 3.1 generates 4K video with native audio and is now free for every Google account at 10 clips per month via Google Vids.

Overview

Veo 3.1 is Google DeepMind's flagship text-to-video and image-to-video model, released in paid preview on October 15, 2025. It runs on the same veo-3.0-generate-001 foundation as Veo 3, with refined training data rather than a new architecture. Production endpoints: veo-3.1-generate-001 and veo-3.1-fast-generate-001 (GA since November 17, 2025), plus veo-3.1-lite-generate-preview (March 31, 2026) as a cost-optimized tier.

TL;DR

  • Produces video at up to 4K with native 48 kHz stereo audio, in 4-, 6-, or 8-second clips at 24 FPS, in 16:9 or 9:16
  • Veo 3.1 sits at Elo 1,209 and Veo 3.1 Fast at Elo 1,205 on the Artificial Analysis Text-to-Video leaderboard, behind Kling 3.0 Pro (1,246) and Dreamina Seedance 2.0 (1,270)
  • Free for every Google account since April 2, 2026 at 10 generations per month via Google Vids; Lite API tier starts at $0.05/s

What makes this release notable is distribution, not raw quality. On April 2, 2026, Google opened Veo 3.1 to every personal Google account via vids.new, creating an addressable base of roughly 3 billion accounts with 10 free 720p generations per month. That's the same week Google shipped Lyria 3 Pro music and directable AI avatars alongside the free rollout. Veo 3.1 isn't the best-scoring model in blind Elo votes - HappyHorse-1.0 from Alibaba leads that board - but it is now the easiest to reach.

Image: Veo 3.1 Ingredients to Video launch collage from the Google blog, showing an astronaut on Mars, a raccoon barista, and several AI-generated reference scenes.
Caption: Veo 3.1's "Ingredients to Video" launch creative, showing the kind of multi-reference generations the model targets for consistent characters and scenes. Source: blog.google

Key Specifications

Specification | Details
Provider | Google DeepMind
Model Family | Veo
Foundation | veo-3.0-generate-001
Endpoints | veo-3.1-generate-001, veo-3.1-fast-generate-001, veo-3.1-lite-generate-preview
Parameters | Not disclosed
Architecture | 3D latent diffusion transformer with joint audio-video denoising
Clip Length | 4, 6, or 8 seconds per generation; extendable via Flow
Resolutions | 720p, 1080p (8s only), 4K (8s only, Standard/Fast)
Aspect Ratios | 16:9 landscape, 9:16 portrait
Frame Rate | 24 FPS
Audio | Native 48 kHz stereo, dialogue + ambient + foley
Input Image Size | 20 MB max for image-to-video
Max Outputs | 4 videos per prompt
Watermarking | SynthID on every frame; C2PA Content Credentials
Release Date | October 15, 2025 (preview), November 17, 2025 (GA)
License | Proprietary, API-only

Benchmark Performance

The cleanest public measurement for video generation today is the Artificial Analysis Text-to-Video Arena: blind pairwise preference votes with continuously updated Elo. Here's where Veo 3.1 sits:

Model | Provider | T2V Elo | Native Audio | Max Resolution
HappyHorse-1.0 | Alibaba-ATH | 1,366 | Yes | 1080p
Dreamina Seedance 2.0 720p | ByteDance | 1,270 | Yes | 720p
Kling 3.0 1080p (Pro) | KlingAI | 1,246 | Partial | 1080p
Runway Gen-4.5 | Runway | 1,217 | No | 4K
Veo 3 | Google | 1,216 | Yes | 1080p
Veo 3.1 | Google | 1,209 | Yes | 4K
Veo 3.1 Fast | Google | 1,205 | Yes | 4K
Sora 2 Pro | OpenAI (retired) | 1,184 | Yes | 1080p
LTX-2 Fast | Lightricks | 1,127 | Yes | 4K

Veo 3.1 actually trails Veo 3 by 7 Elo points in blind voting as of April 2026, a signal that the 3.1 release is about adding capabilities (4K, portrait, reference images) rather than raising the ceiling on perceptual quality. Runway Gen-4.5 leads Veo 3.1 by 8 Elo points and Kling 3.0 Pro by 37, which under the standard Elo formula works out to roughly a 51-55% win rate for those competitors in head-to-head comparisons.
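To make the link between an Elo gap and a win rate concrete, here is a minimal sketch of the conversion. It assumes the leaderboard uses the conventional logistic Elo model on a 400-point scale; Artificial Analysis may fit its ratings differently, so treat the output as an approximation. The function name is illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected win probability of A over B under the standard 400-point Elo model (assumed)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


VEO_3_1 = 1209  # T2V Elo from the table above

# Competitors ranked above Veo 3.1 in the table above.
rivals = {
    "Runway Gen-4.5": 1217,
    "Kling 3.0 1080p (Pro)": 1246,
    "Dreamina Seedance 2.0 720p": 1270,
    "HappyHorse-1.0": 1366,
}

for name, rating in rivals.items():
    win_rate = elo_expected_score(rating, VEO_3_1)
    print(f"{name}: {rating - VEO_3_1:+d} Elo, {win_rate:.1%} expected win rate vs Veo 3.1")
```

Under that assumption, an 8-point gap comes out just above 51% and a 37-point gap just above 55%, so leads of this size amount to a modest edge in blind votes rather than a decisive one.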

On Meta's MovieGenBench, the ranking flips. Google reports Veo 3.1 scoring highest on text alignment across 1,003 human-rated prompts and beating competitors on the physics subset. Those are Google's own evaluations, not independent measurements, and should be read as such. See our video generation benchmarks leaderboard for VBench, VBench-2.0, and Elo together.

Key Capabilities

Native audio at 48 kHz stereo

Joint audio-visual diffusion is the headline capability Veo 3 shipped and Veo 3.1 refined. Audio and video are created in the same pass, so a dog bark lands on the exact frame where the dog opens its mouth. Dialogue, ambient sound, and foley all come out of one model. Kling and Runway still rely largely on post-generation audio pipelines, so Veo gets lip-sync and event-locking without stitching.

Ingredients to Video and Frames to Video

Three workflow features matter more than marginal Elo. Ingredients to Video takes up to three reference images - a character, a prop, a style plate - and conditions generation on all of them. Frames to Video fills the motion between a start and end frame, useful for matching AI clips to existing footage. Extend chains generations in 7-second increments, up to 20 times, toward sequences near 150 seconds. All three work in Flow, Google AI Studio, Vertex AI, and the Gemini API.
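The "near 150 seconds" figure follows from simple arithmetic, sketched below. The 8-second starting clip is an assumption (the longest single generation); a shorter first clip gives a slightly shorter maximum. Names are illustrative.

```python
BASE_CLIP_SECONDS = 8   # assumed: start from the longest single generation
EXTENSION_SECONDS = 7   # increment added by each Extend step
MAX_EXTENSIONS = 20     # cap on chained extensions


def extended_length(extensions: int, base: int = BASE_CLIP_SECONDS) -> int:
    """Total sequence length in seconds after chaining `extensions` Extend steps."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return base + extensions * EXTENSION_SECONDS


print(extended_length(MAX_EXTENSIONS))  # 8 + 20 * 7 seconds
```

With the assumed 8-second base, the cap lands at 148 seconds.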

Image: Veo 3.1 Lite promotional graphic from the Google developers announcement on March 31, 2026.
Caption: Veo 3.1 Lite landed March 31, 2026 as a cost-optimized API tier cutting per-second pricing by more than half versus Veo 3.1 Fast. Source: 9to5google.com

4K and native portrait

Veo 3 topped out at 1080p landscape. Veo 3.1 adds true 4K (3840x2160) for 8-second clips on Standard and Fast, plus native 9:16 portrait trained on real vertical data rather than cropped landscape. For TikTok, Reels, and YouTube Shorts workflows, this closes a capability gap that kept Google behind Kling.

Pricing and Availability

Google runs three API tiers plus a consumer free tier through Google Vids. Published per-second rates settle in this range (Google hasn't posted a durable rate card, and the Lite tier is still in preview; figures reflect third-party reporting):

Tier | Price per Second | Max Resolution | Notes
Veo 3.1 Lite | $0.05 (720p), $0.08 (1080p) | 1080p | Launched March 31, 2026; cheapest Google option
Veo 3.1 Fast | $0.15 (audio on), ~$0.10 (audio off) | 4K | Price reductions rolled out April 7, 2026
Veo 3.1 Standard | $0.40 (720p/1080p), $0.60 (4K) | 4K | Full-quality model, highest latency

A 4-second 720p Lite clip costs $0.20. Because a single generation tops out at 8 seconds, longer runs are billed across multiple generations: ten seconds of 1080p Fast footage with audio runs about $1.50, and ten seconds of 4K Standard footage costs $6.00. Veo 3.1 Lite's $0.05 to $0.08/second overlaps Kling's $0.029 to $0.07/second third-party rates without undercutting them, and Standard runs well above Runway Gen-4.5's $0.10-$0.15/second.
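The per-second arithmetic can be sketched as a small calculator. The rates are the third-party figures from the table above, not an official rate card, and the tier and resolution labels are this sketch's own naming rather than API values.

```python
from decimal import Decimal


def per_second_rate(tier: str, resolution: str, audio: bool = True) -> Decimal:
    """Look up the reported USD rate per second of generated video."""
    if tier == "lite":
        rates = {"720p": "0.05", "1080p": "0.08"}
    elif tier == "fast":
        # The table quotes Fast by audio setting rather than by resolution.
        rates = dict.fromkeys(("720p", "1080p", "4k"), "0.15" if audio else "0.10")
    elif tier == "standard":
        rates = {"720p": "0.40", "1080p": "0.40", "4k": "0.60"}
    else:
        raise ValueError(f"unknown tier: {tier}")
    if resolution not in rates:
        raise ValueError(f"{resolution} is not offered on the {tier} tier")
    return Decimal(rates[resolution])


def footage_cost(tier: str, resolution: str, seconds: int, audio: bool = True) -> Decimal:
    """Cost of `seconds` of footage, possibly spread over several generations."""
    return per_second_rate(tier, resolution, audio) * seconds


print(footage_cost("lite", "720p", 4))       # one 4-second Lite clip
print(footage_cost("fast", "1080p", 8))      # one 8-second Fast clip, audio on
print(footage_cost("standard", "4k", 8))     # one 8-second 4K Standard clip
```

Using `Decimal` keeps the cent values exact; a single 8-second generation comes to $1.20 on Fast with audio and $4.80 on 4K Standard at these rates.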

Consumer tiers

For the 3 billion personal Google accounts, the entry point is vids.new:

  • Free (any Google account): 10 generations/month at 720p, up to 8 seconds, audio included, SynthID watermarks
  • Google AI Pro (~$22/month): 50 generations/month, Lyria 3 music (30-sec clips), directable AI avatars
  • Google AI Ultra (~$275/month): 1,000 generations/month, Lyria 3 Pro music (up to 3 min), all avatar controls

Workspace accounts under Business Starter, Enterprise Starter, Nonprofit, and Education plans get the free tier as a promotional add-on through May 31, 2026.
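For a rough sense of what the paid consumer tiers cost per clip, the sketch below divides the approximate monthly prices listed above by the monthly generation allowance. It assumes the full allowance is used and ignores the bundled music and avatar features, so it overstates the effective price of video alone.

```python
# (approximate monthly price in USD, generations per month), from the list above
PLANS = {
    "Google AI Pro": (22, 50),
    "Google AI Ultra": (275, 1000),
}


def price_per_generation(monthly_usd: float, generations: int) -> float:
    """Effective USD per generation if the whole monthly allowance is used."""
    return monthly_usd / generations


for plan, (monthly_usd, generations) in PLANS.items():
    per_clip = price_per_generation(monthly_usd, generations)
    print(f"{plan}: about ${per_clip:.3f} per generation")
```

On these figures AI Pro works out to about $0.44 per generation and AI Ultra to about $0.275.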

API access

Developers get Veo 3.1 via the Gemini API, Google AI Studio, Vertex AI, and partners like fal.ai. Enterprise Vertex AI deployments add VPC Service Controls, audit logging, and Content Credentials (C2PA) metadata alongside SynthID. There's no free API tier; the consumer free allocation only runs inside Google Vids.

Strengths and Weaknesses

Strengths

  • Native 48 kHz stereo audio jointly diffused with video - no post-dubbing pipeline
  • True 4K and native 9:16 portrait; most competitors still cap at 1080p
  • Strongest MovieGenBench prompt adherence in Google's own evaluations, including the physics subset
  • Ingredients to Video (up to 3 reference images) and Scene Extension to sequences well past 60 seconds (near 150 seconds at the 20-extension cap)
  • 3 billion Google accounts with 10 free clips/month is the widest addressable base of any video model
  • SynthID on every frame plus C2PA Content Credentials - strongest provenance stack shipping today

Weaknesses

  • Elo 1,209 trails Kling 3.0 Pro (1,246) and Seedance 2.0 (1,270) in blind preference voting
  • Posts a 7-Elo-point regression against Veo 3 on the T2V board: the 3.1 label is about features, not visual ceiling
  • Proprietary, API-only, no weights or self-hosted option
  • 15-40% higher API cost than Veo 3; 25-30% slower with audio enabled
  • No durable per-second rate card published by Google
  • Free tier's 10 clips/month is a funnel, not a working budget; Kling and Runway offer more latitude on their free plans
  • Dialogue-heavy scenes still show intermittent lip-sync drift in published side-by-sides

FAQ

What's different between Veo 3 and Veo 3.1?

Same veo-3.0-generate-001 foundation with refined training data. Visible additions: 4K, native 9:16 portrait, Ingredients to Video (3 reference images), Frames to Video, Scene Extension, and richer audio.

Is Veo 3.1 really free?

For any Google account, yes - 10 generations per month at 720p via vids.new. Output carries SynthID and caps at 8-second clips. Beyond that it's about $22/month (AI Pro, 50 clips) or about $275/month (AI Ultra, 1,000 clips).

How does Veo 3.1 compare to Kling 3.0 and Runway Gen-4.5?

Kling 3.0 Pro beats Veo 3.1 on Artificial Analysis Elo (1,246 vs 1,209) and is the cheapest of the three per second. Runway Gen-4.5 (1,217) is the pro-editor choice for frame-level control. Veo 3.1 wins on audio sync, MovieGenBench prompt adherence (in Google's own evaluations), and distribution.

Can I run Veo 3.1 locally?

No. Veo 3.1 is proprietary and API-only. For self-hosted video generation, LTX-2.3 from Lightricks is the strongest open-weight option and runs on a single RTX 4090 with FP8 quantization.

What model ID do I use on Vertex AI?

veo-3.1-generate-001 (full), veo-3.1-fast-generate-001 (faster/cheaper), veo-3.1-lite-generate-preview (Lite). All take prompt, aspect ratio, duration (4/6/8 seconds), resolution (720p/1080p, plus 4K on the full and Fast models), and an optional reference image. 1080p and 4K are available for 8-second clips only.
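Before sending a request, it can help to check the parameters against the limits in the specification table. The sketch below is a local pre-flight check only: the `VeoRequest` class, its field names, and the lowercase resolution labels are hypothetical and do not mirror the actual Vertex AI request schema.

```python
from dataclasses import dataclass

# Resolutions per model ID, taken from the specification and pricing tables.
RESOLUTIONS_BY_MODEL = {
    "veo-3.1-generate-001": {"720p", "1080p", "4k"},
    "veo-3.1-fast-generate-001": {"720p", "1080p", "4k"},
    "veo-3.1-lite-generate-preview": {"720p", "1080p"},
}
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS = {4, 6, 8}


@dataclass(frozen=True)
class VeoRequest:
    """Hypothetical request shape; field names are illustrative, not the real schema."""
    model: str
    prompt: str
    aspect_ratio: str = "16:9"
    duration_seconds: int = 8
    resolution: str = "720p"


def validate(req: VeoRequest) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request looks valid."""
    problems = []
    allowed = RESOLUTIONS_BY_MODEL.get(req.model)
    if allowed is None:
        problems.append(f"unknown model ID: {req.model}")
    elif req.resolution not in allowed:
        problems.append(f"{req.resolution} is not available on {req.model}")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    if req.duration_seconds not in DURATIONS:
        problems.append(f"duration must be 4, 6, or 8 seconds, got {req.duration_seconds}")
    if req.resolution in {"1080p", "4k"} and req.duration_seconds != 8:
        problems.append(f"{req.resolution} is limited to 8-second clips")
    if not req.prompt.strip():
        problems.append("prompt is empty")
    return problems


request = VeoRequest(
    model="veo-3.1-lite-generate-preview",
    prompt="a raccoon barista pouring latte art",
    resolution="4k",
)
print(validate(request))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once; here the Lite request is flagged because that tier stops at 1080p.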

Sources

✓ Last verified April 21, 2026

About the author
AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.