Factory Raises $150M to Scale Enterprise AI Droids

Factory, the three-year-old startup building autonomous software development agents, closed a $150 million Series C on April 16 at a $1.5 billion valuation. Khosla Ventures led the round, with Sequoia Capital, Blackstone, Insight Partners, and a handful of smaller funds participating. Keith Rabois, a managing director at Khosla, joined the board.

The raise lands in a market that's gotten crowded fast. Cursor closed an oversubscribed enterprise round targeting a $50 billion valuation just days later. GitHub Copilot has over 1.8 million paid seats. Anthropic's Claude Code is eating into developer workflows from a different angle. Factory's pitch is that all of them are solving the wrong problem: they help developers write code faster, but they don't automate the rest of what a software team does.

TL;DR

$150M Series C led by Khosla Ventures; valuation $1.5B
Revenue doubled month-over-month for six consecutive months
Droids handle testing, review, docs, migrations - not just generation
Model-agnostic platform switches between Claude, DeepSeek, and others
Customers include Morgan Stanley, EY, Adobe, Palo Alto Networks, Adyen

What Factory Actually Builds

Factory's core product is a set of agents called Droids, designed to operate across the full software development lifecycle. The company's founders - CEO Matan Grinberg, who left a Berkeley physics PhD program, and CTO Eno Reyes, who was an ML engineer at Hugging Face before co-founding the company - met at Princeton and reconnected at a hackathon. Both were named to Forbes' 30 Under 30 in 2025.

Their bet from day one was that enterprises don't need a smarter autocomplete. They need agents that can take a task from an issue tracker and ship a pull request - with tests, documentation, and code review included.

Droids - the SDLC agent layer

Droids are Factory's individual task agents. The original use case was automating the work developers hate most: test generation, documentation updates, code review, refactoring, and migrations. Those tasks are well-defined, have clear acceptance criteria, and are tedious enough that most engineers defer them indefinitely.

According to Factory's benchmarks, Droids rank first on Terminal-Bench 2 and SWE-Bench Verified among software development agents. Those numbers should come with the usual benchmark caveats - Factory also publishes Legacy-Bench, which it wrote itself, and the two independent benchmarks (Terminal-Bench, SWE-Bench) measure capabilities that may not generalize to your specific codebase.

Droids integrate directly into existing tooling rather than replacing it. The five access paths are IDE (VS Code, JetBrains, Vim), desktop app, CLI, Slack or Teams, and project management tools like Linear. Triggering a Droid from a Linear issue assignment and getting a draft PR back is the demo most prospects see first.

Missions - multi-agent long-horizon workflows

Missions are Factory's answer to tasks that take days rather than hours. A Mission breaks down a complex objective - say, migrating a monolith service to a new database client - into subtasks, assigns them to individual Droids, and coordinates the work. Each Droid maintains its own context and can call back to a coordinating agent when it hits ambiguity.
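The coordination pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not Factory's implementation: the class names (Coordinator, Droid, Subtask), the hard-coded plan, and the "ambiguous" flag are all hypothetical stand-ins for whatever the real platform does internally.

```python
from dataclasses import dataclass, field


@dataclass
class Subtask:
    name: str
    ambiguous: bool = False  # hypothetical flag: the worker cannot proceed alone


@dataclass
class Droid:
    """Hypothetical worker agent that keeps its own private context."""
    name: str
    context: list = field(default_factory=list)

    def run(self, task, coordinator):
        self.context.append(f"started {task.name}")
        if task.ambiguous:
            # Call back to the coordinating agent instead of guessing.
            hint = coordinator.resolve(task)
            self.context.append(f"coordinator said: {hint}")
        self.context.append(f"finished {task.name}")
        return f"{task.name}: done by {self.name}"


class Coordinator:
    """Hypothetical coordinating agent: splits an objective and assigns work."""

    def __init__(self):
        self.escalations = []

    def plan(self, objective):
        # A real planner would derive subtasks from the objective;
        # here they are hard-coded for illustration.
        return [
            Subtask("inventory call sites"),
            Subtask("swap database client", ambiguous=True),
            Subtask("update tests"),
        ]

    def resolve(self, task):
        # Record the escalation and hand back a (canned) decision.
        self.escalations.append(task.name)
        return f"use the default option for '{task.name}'"

    def run_mission(self, objective):
        results, droids = [], []
        for i, task in enumerate(self.plan(objective)):
            droid = Droid(f"droid-{i}")  # one worker, one context, per subtask
            droids.append(droid)
            results.append(droid.run(task, self))
        return results, droids


coordinator = Coordinator()
results, droids = coordinator.run_mission("migrate service to new database client")
for line in results:
    print(line)
```

The point of the sketch is the shape of the design: each worker's context stays separate, and only the ambiguous subtask generates traffic back to the coordinator.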

Grinberg described the goal to the Wall Street Journal as building agents that handle "weeks' worth of work." That's an aspiration the platform is still working toward; Missions are the architectural attempt to get there.

Factory Desktop and CLI

Factory Desktop is a native macOS and Windows application where each Droid gets a persistent cloud machine - not a temporary sandbox. Sessions survive restarts, maintain file system state, and can be monitored across devices. The sidebar shows all active Droid sessions simultaneously, so engineers can kick off a feature build and a database migration in parallel without losing context on either.

The CLI is the integration layer for automation teams. Install via Homebrew or curl, then run from inside any repo:

# Install the Factory Droid CLI
brew install --cask droid

# Navigate to your project
cd /path/to/your/project

# Start an interactive Droid session
droid

# Switch models mid-session with a slash command
/model deepseek

# Trigger a code review pass
/review

The /model command is the model-agnostic architecture in practice. Factory routes tasks between Claude, DeepSeek, GPT-5.3-Codex, and other providers based on cost and capability. Enterprises on regulated infrastructure can pin a specific provider; everyone else lets Factory route automatically.
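To make the routing idea concrete, here is a small Python sketch of cost-and-capability routing with an optional pinned provider. It assumes one simple policy (pick the cheapest provider that clears a capability bar); Factory has not published its routing logic, and the cost and capability numbers below are invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    name: str
    cost_per_task: float  # made-up relative cost, not real pricing
    capability: int       # made-up score, higher means stronger


# Illustrative values only; none of these figures come from Factory or the labs.
PROVIDERS = [
    Provider("claude", cost_per_task=3.0, capability=9),
    Provider("gpt-5.3-codex", cost_per_task=2.5, capability=9),
    Provider("deepseek", cost_per_task=0.5, capability=6),
]


def route(required_capability: int, pinned: Optional[str] = None) -> Provider:
    """Pick the cheapest provider that meets the capability bar.

    If `pinned` is set, that provider is always returned, which mirrors
    the "pin a specific provider" option for regulated environments.
    """
    if pinned is not None:
        for provider in PROVIDERS:
            if provider.name == pinned:
                return provider
        raise ValueError(f"unknown provider: {pinned}")

    eligible = [p for p in PROVIDERS if p.capability >= required_capability]
    if not eligible:
        raise ValueError("no provider meets the capability requirement")
    return min(eligible, key=lambda p: p.cost_per_task)


easy_choice = route(required_capability=5)
hard_choice = route(required_capability=8)
pinned_choice = route(required_capability=5, pinned="claude")
print(easy_choice.name, hard_choice.name, pinned_choice.name)
```

Under these invented numbers, a low-bar task goes to the cheapest provider, a high-bar task goes to the cheapest provider that still qualifies, and a pinned provider overrides both.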

Image: Robotic arm on an automated assembly line. Caption: Factory's "Droids" metaphor maps directly to industrial automation - autonomous agents that handle repeatable technical tasks without human supervision for each step. Source: unsplash.com

The Funding Stack

Factory has raised in stages that track its growth. Sequoia seeded the company after Grinberg cold-emailed partner Shaun Maguire with a note about shared interests in physics research. Within six months of founding, Factory had signed two public companies and two decacorns as customers.

The $50M Series B in September 2025 came at a $300 million valuation. The $150M Series C values the company five times higher, which tracks the revenue growth: six consecutive months of month-over-month doubling, a 64-fold increase if each month's revenue was exactly twice the last. That pace cannot hold indefinitely - doubling gets harder as the base grows - but it's a strong signal that enterprise buyers are renewing and expanding, not just trialing.

Round	Amount	Valuation	Lead
Seed	Undisclosed	-	Sequoia Capital
Series B	$50M	$300M	Sequoia Capital
Series C	$150M	$1.5B	Khosla Ventures
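The arithmetic behind the valuation step-up and the doubling claim is easy to check. The short Python snippet below uses only figures from the article and assumes that "doubling" means exactly 2x each month, which is a simplification.

```python
# Figures from the article; the rest is arithmetic.
series_b_valuation = 300_000_000
series_c_valuation = 1_500_000_000
doubling_months = 6

# Step-up between rounds.
valuation_multiple = series_c_valuation / series_b_valuation

# Six consecutive doublings compound to 2**6.
revenue_multiple = 2 ** doubling_months

# Sustaining the same pace for another six months would compound to 2**12.
twelve_month_multiple = 2 ** (2 * doubling_months)

print(f"Valuation step-up: {valuation_multiple:.0f}x")
print(f"Revenue multiple after {doubling_months} doublings: {revenue_multiple}x")
print(f"Revenue multiple after {2 * doubling_months} doublings: {twelve_month_multiple}x")
```

The last figure shows why the pace has to slow: twelve straight doublings would multiply revenue by 4,096.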

Blackstone's participation is worth noting. The firm isn't a typical early-stage AI investor. Its presence signals that institutional buyers at the largest enterprises - the clients Blackstone knows well - are actively evaluating Factory, not just experimenting with it.

Image: Enterprise software development team in a modern office. Caption: Factory's paying customers include engineering teams at Morgan Stanley, Ernst & Young, Adobe, Palo Alto Networks, and Adyen - firms with strict compliance and security requirements that ruled out many AI coding tools. Source: unsplash.com

Legacy-Bench - Measuring the Hard Stuff

Factory published Legacy-Bench to quantify something the standard benchmarks ignore: how well AI agents handle the code that actually runs most enterprise infrastructure. The benchmark covers COBOL, Fortran, Java 7, C, BASIC, and Assembly, with tasks split across bug fixing, implementation, and migration.

The results are candid, and worth a close look. Pass rates across 12 model-agent combinations ran from 16.9% to 42.5% overall. GPT-5.3-Codex led at 42.5% but was weakest on C. Gemini 3.1 Pro showed the most balanced profile at 38.7%. Opus 4.6 topped C performance at 34.3% but scored 18.1% on COBOL.

The benchmark's own finding: 44 tasks were unsolvable by any model tested, and 31 of those were COBOL. That language remains a wall. The same frontier models that score above 70% on SWE-Bench Verified drop to single and low double digits on COBOL tasks. Factory's Droids aren't immune to that ceiling.
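A quick calculation puts the reported figures in proportion. The Python snippet below uses only numbers quoted in this article; it does not reproduce the benchmark itself.

```python
# Numbers as reported in the article.
unsolved_total = 44
unsolved_cobol = 31
lowest_pass_rate = 16.9    # percent, worst model-agent combination
highest_pass_rate = 42.5   # percent, best model-agent combination
opus_c = 34.3              # percent, Opus 4.6 on C
opus_cobol = 18.1          # percent, Opus 4.6 on COBOL

# Share of universally unsolved tasks that are COBOL.
cobol_share = unsolved_cobol / unsolved_total

# Gap between best and worst overall results, in percentage points.
spread = round(highest_pass_rate - lowest_pass_rate, 1)

# How far one model falls from C to COBOL, in percentage points.
opus_gap = round(opus_c - opus_cobol, 1)

print(f"COBOL share of unsolved tasks: {cobol_share:.1%}")
print(f"Overall pass-rate spread: {spread} points")
print(f"Opus 4.6 drop from C to COBOL: {opus_gap} points")
```

Roughly seven in ten of the tasks that no model could solve were COBOL, which is the concentration the "wall" description refers to.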

Still, the benchmark is useful for enterprises evaluating AI agents for modernization work. Publishing results that show the limits of Factory's own product makes it more credible than a results sheet showing only wins.

Where It Falls Short

The model-agnostic pitch cuts both ways. Flexibility is real - customers aren't locked to a single provider and can respond to pricing shifts or capability jumps from any lab. But model-agnostic also means no deep optimization for any specific model. Anthropic's Claude Code integrates tightly with Claude's extended thinking and long context. Cursor has spent years improving its model routing for fast, interactive coding. Factory's routing layer is an abstraction over those integrations, not a replacement for them.

The competitive field is also getting more compressed at the top. When Cursor is raising at $50B and Claude Code is shipping parallel agents and desktop control, the market for AI coding infrastructure is no longer a wide-open space. Factory's differentiation - full SDLC automation rather than coding assistance - is a real distinction, but it requires enterprises to trust autonomous agents with test suites, documentation, and code review, not just code generation. That trust takes time to earn.

Factory's COBOL wall is the clearest example of where current models still break. Enterprises with mainframe infrastructure represent a substantial market for modernization tooling, and none of the current frontier agents can reliably complete that work. Factory acknowledges this in Legacy-Bench. It's a gap the company is explicitly trying to close.

At $1.5B, Factory is priced for continued compounding. The revenue trajectory supports the valuation today. Whether Droids can truly automate weeks of engineering work - not hours - is the question the next round will have to answer.
