Luma Launches Agents for End-to-End Creative Work

Luma AI's new Agents platform, powered by the Uni-1 Unified Intelligence model, lets creative teams go from a written brief to finished video, images, and audio in one workflow.

Luma AI today launched Luma Agents, a platform that takes a creative brief and handles the full production pipeline - writing copy, creating images, producing video clips, and layering in audio - without requiring teams to jump between separate tools. At the core is Uni-1, Luma's new Unified Intelligence model, which the company positions as a meaningful architectural departure from the existing generation of multimodal systems.

TL;DR

  • Luma AI launched Luma Agents today, built on Uni-1 - its first Unified Intelligence model
  • Agents plan and produce text, images, video, and audio from a single creative brief
  • Coordinates 8+ external models including Ray3.14, Google Veo 3, ByteDance Seedream, and ElevenLabs
  • API access open now with gradual rollout; Publicis Groupe, Adidas, and Mazda are among the first enterprise customers

What Luma Agents Actually Do

The pitch is straightforward: drop in a brief and an asset (the demo used a product image of a lipstick tube), and the system produces a full set of ad campaign variations - location suggestions, model selections, color scheme options, scripted video clips, voiceover. Teams steer the direction through conversation rather than re-prompting each output from scratch.

That persistent context is the key differentiator Luma is stressing. Most current AI agent setups require developers to wire together separate models, pass context manually, and handle failures at each boundary. Luma Agents are meant to manage that coordination layer themselves, maintaining state across assets, contributors, and iterations.
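The "maintain state across iterations" idea can be sketched in a few lines. Everything below (class name, step structure, the shape of the context dict) is hypothetical, since Luma has not published the internal interface; it only illustrates the difference between re-prompting from scratch and carrying prior outputs forward:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    """Hypothetical sketch of a run that keeps context across steps,
    instead of re-prompting each output from scratch."""
    brief: str
    history: list = field(default_factory=list)  # every step's input/output

    def step(self, instruction, produce):
        # Each step sees the brief plus everything produced so far.
        context = {"brief": self.brief, "prior": list(self.history)}
        output = produce(instruction, context)
        self.history.append({"instruction": instruction, "output": output})
        return output

# Toy producer: just records how much prior context it was handed.
run = AgentRun(brief="summer lipstick campaign")
run.step("suggest locations", lambda i, c: f"{i} with {len(c['prior'])} prior steps")
out = run.step("pick color scheme", lambda i, c: f"{i} with {len(c['prior'])} prior steps")
print(out)  # the second step sees the first step's output in its context
```

The point of the sketch: conversational steering works because every new instruction lands on top of the accumulated history, not on a blank prompt.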

The Uni-1 Core

Uni-1 is a decoder-only autoregressive transformer built on a shared token space that interleaves language tokens and image tokens natively. Rather than routing inputs to separate encoders for each modality, the model processes text and pixels in the same forward pass. CEO Amit Jain describes the result as "pixel intelligence" - the model can reason in language and render directly into image space without an intermediate step.

The architecture aims to support adjustable chain-of-thought reasoning, meaning the system can spend more compute working through a complex brief before creating any output. That matters for creative workflows where a poorly planned first step wastes expensive generation budget downstream.
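The shared-token-space idea can be made concrete with a toy example. The vocabulary sizes and offsets below are invented for illustration and have nothing to do with Uni-1's actual tokenizer; the sketch only shows what it means for text and image tokens to interleave in one autoregressive sequence:

```python
# Toy illustration of a shared token space: text tokens and image tokens
# live in one vocabulary, so a single decoder-only sequence can interleave
# them. All sizes and offsets here are invented for illustration.
TEXT_VOCAB = 50_000          # hypothetical text vocabulary size
IMAGE_VOCAB = 8_192          # hypothetical image codebook size
IMAGE_OFFSET = TEXT_VOCAB    # image tokens start after text tokens

def text_token(tid): return tid                  # ids in [0, TEXT_VOCAB)
def image_token(tid): return IMAGE_OFFSET + tid  # ids in [TEXT_VOCAB, TEXT_VOCAB + IMAGE_VOCAB)

# One interleaved sequence: caption tokens, then image patch tokens, then more text.
sequence = ([text_token(t) for t in (11, 42, 7)]
            + [image_token(t) for t in (3, 3_000, 8_191)]
            + [text_token(99)])

def modality(tok):
    return "image" if tok >= IMAGE_OFFSET else "text"

print([modality(t) for t in sequence])
```

A model trained on such sequences can emit either modality at any position, which is the substance of the "render directly into image space without an intermediate step" claim.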

The Orchestration Layer

Uni-1 handles planning and reasoning, but production-quality output relies on routing subtasks to specialized models. Luma Agents coordinates with an external model stack that the system selects automatically based on task requirements:

Model            Provider    Role
Ray3.14          Luma AI     Primary video generation (1080p, 4x speed)
Veo 3            Google      Secondary video with native audio generation
Seedream         ByteDance   Image generation for storyboard frames
ElevenLabs       ElevenLabs  Voice and audio synthesis
Nano Banana Pro  Google      Lightweight inference tasks
GPT Image 1.5    OpenAI      Image generation fallback and editing
The orchestration layer selects and routes tasks automatically, evaluates outputs against the original brief, and loops back for refinement when results fall short. That self-critique cycle - generate, evaluate, regenerate - is where Luma's team says the persistent context provides real value: the system knows what it already tried.
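The route/generate/evaluate/retry cycle can be sketched as follows. The model names mirror the table above, but the routing rules, scoring function, and thresholds are all invented for illustration; Luma has not documented how its orchestrator actually scores outputs or picks fallbacks:

```python
# Hypothetical sketch of a generate-evaluate-regenerate loop with routing.
# Model names come from Luma's published stack; everything else is assumed.
ROUTES = {
    "video": ["Ray3.14", "Veo 3"],           # primary, then fallback
    "image": ["Seedream", "GPT Image 1.5"],
    "audio": ["ElevenLabs"],
}

def run_task(task, generate, evaluate, threshold=0.8, max_attempts=3):
    attempts = []  # persistent record: the system knows what it already tried
    for n in range(max_attempts):
        candidates = ROUTES[task["type"]]
        model = candidates[min(n, len(candidates) - 1)]
        output = generate(model, task, attempts)
        score = evaluate(output, task["brief"])
        attempts.append({"model": model, "score": score})
        if score >= threshold:
            return output, attempts
    return output, attempts  # best effort after exhausting retries

# Toy stand-ins: the first attempt scores below threshold, the fallback passes.
gen = lambda model, task, prior: f"{model} output (attempt {len(prior) + 1})"
ev = lambda out, brief: 0.5 if "Ray3.14" in out else 0.9
result, attempts = run_task({"type": "video", "brief": "lipstick ad"}, gen, ev)
print(result, [a["model"] for a in attempts])
```

Note that `attempts` is threaded back into `generate`: that is the mechanical form of "the system knows what it already tried."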

# Luma Agents REST API - submit a creative brief
curl -X POST https://api.lumalabs.ai/v1/agents/runs \
  -H "Authorization: Bearer $LUMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "brief": "30-second product ad for summer lipstick campaign, bold colors, outdoor setting",
    "assets": [{"type": "image", "url": "https://...product.jpg"}],
    "outputs": ["video", "image_variants", "copy"],
    "reasoning_effort": "high"
  }'

The reasoning_effort parameter controls how much planning compute Uni-1 uses before starting generation - a design pattern borrowed from reasoning model APIs and relevant here because longer brief planning correlates with fewer wasted generation cycles.
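Building the same request body programmatically looks like this. The field names are taken from the curl example above; the set of valid reasoning_effort values is an assumption modeled on other reasoning-model APIs, and Luma's documented values may differ:

```python
import json

# The accepted effort levels are an assumption, not Luma's documented set.
VALID_EFFORT = {"low", "medium", "high"}

def build_run_request(brief, assets, outputs, reasoning_effort="medium"):
    """Build the JSON body for a hypothetical agents/runs request."""
    if reasoning_effort not in VALID_EFFORT:
        raise ValueError(f"reasoning_effort must be one of {sorted(VALID_EFFORT)}")
    return json.dumps({
        "brief": brief,
        "assets": assets,
        "outputs": outputs,
        "reasoning_effort": reasoning_effort,
    })

body = build_run_request(
    brief="30-second product ad for summer lipstick campaign",
    assets=[{"type": "image", "url": "https://example.com/product.jpg"}],
    outputs=["video", "image_variants", "copy"],
    reasoning_effort="high",
)
print(body)
```

Validating the effort level client-side is cheap insurance when a typo would otherwise burn a full planning-plus-generation cycle.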

Amit Jain at Web Summit Qatar in February 2026, where he discussed Luma's expansion plans and the company's bet on multimodal intelligence as the path to AGI.

Enterprise Deployment

Luma isn't doing a wide public launch. Access is via API with a gradual rollout - the company says it wants to avoid the capacity problems that have hit similar platforms at launch. Publicis Groupe and Serviceplan are already running the system for client work, with Adidas, Mazda, and Saudi AI company Humain listed as active brand deployments.

For reference, Luma closed a $900 million Series C last year, led by Humain with AMD also participating. The funding backed Project Halo, a 2GW compute supercluster that's supposed to begin deployment this quarter - which gives some context for why Luma can afford to run a coordination layer on top of third-party model APIs while still controlling inference costs.

The timing is deliberate. Luma also announced a $1 million Cannes Lions competition earlier this month, aimed at agency creatives. The Agents launch pairs well with that - the company is clearly targeting the ad and marketing production stack as its first commercial wedge, which is smart given how much of that work is already multimodal (storyboard, script, video, VO) and how poorly it maps onto single-purpose tools.

The end-to-end ad campaign pipeline - brief to final delivery across text, images, video, and audio - is the specific workflow Luma Agents targets.

Where It Falls Short

A few things to watch before trusting this in production.

Model selection opacity. Luma Agents picks which external model to use for each task, but the routing logic isn't exposed to developers. If Veo 3 gets rate-limited or Ray3.14 produces an off-brand result, there's no documented way to override the selection or inspect why the system made that choice. For agencies with client-specific brand guardrails, that black box is a risk.

No benchmark data yet. Luma hasn't released any independent evaluation of Uni-1 against comparable multimodal models. The demos look polished, but the claim that this is meaningfully different from prompt-chaining across separate APIs can't be verified until third parties run standardized evals.

External model dependencies. The platform's quality ceiling is partly determined by the external models it coordinates. If Google deprecates an API or ByteDance access changes, the orchestration stack breaks in ways Luma doesn't fully control. That is a real infrastructure risk for teams building production pipelines on top of this.

Gradual rollout means no SLA. The "gradual access" framing means enterprise teams cannot plan around guaranteed availability, at least not yet.

If you are assessing this against the existing landscape of AI video generation tools, the honest comparison isn't Luma Agents vs. individual video generators - it is Luma Agents vs. building your own orchestration layer using something like an AI agent framework plus individual API integrations. Luma is betting that most creative teams don't want to do that plumbing work themselves. For agencies at scale, that bet is probably right.
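For a sense of what that plumbing work involves, here is a minimal sketch of the do-it-yourself alternative: calling separate provider APIs, threading context by hand, and handling failures at each boundary. Every function below is a stand-in stub, not a real provider SDK:

```python
# What "building your own orchestration layer" means in practice:
# separate API calls, manual context passing, failure handling at
# every boundary. All functions here are illustrative stubs.
def call_image_api(prompt):
    raise TimeoutError("provider rate-limited")   # simulate a provider failure

def call_video_api(prompt, reference_image):
    return f"video from '{prompt}' using {reference_image}"

def diy_pipeline(brief):
    context = {"brief": brief}
    try:
        context["storyboard"] = call_image_api(brief)
    except TimeoutError:
        # Fallback logic is yours to write for every provider.
        context["storyboard"] = "fallback:placeholder-frame"
    # Context is threaded manually from step to step.
    return call_video_api(context["brief"], context["storyboard"])

clip = diy_pipeline("summer lipstick ad, outdoor setting")
print(clip)
```

Multiply this by every model in the stack and every retry path, and the appeal of a managed coordination layer is clear, whatever one concludes about Uni-1 itself.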


The underlying architecture here - a natively multimodal reasoning model that can delegate to specialized generators - is a cleaner design than bolting together existing LLMs and diffusion models with ad-hoc orchestration. Whether Uni-1 actually delivers on the reasoning quality that justifies that architecture is the question that the next few months of production use will answer.

About the author

Sophie is a journalist and former systems engineer who covers AI infrastructure, open-source models, and the developer tooling ecosystem.