Vast Data Raises $1B at $30B, NVIDIA Backs AI Storage
Vast Data closes a $1B Series F at $30B valuation - triple its 2023 price - with NVIDIA, Drive Capital, and Access Industries backing its push to own the data layer for AI infrastructure.

Vast Data has closed a $1 billion Series F at a $30 billion valuation, more than tripling its price in just 16 months. The round is led by Drive Capital and Access Industries, with NVIDIA continuing as an investor alongside Fidelity Management and Research Company and NEA. Named customers include CoreWeave, Cursor, Mistral AI, JPMorgan Chase, and the U.S. Air Force.
TL;DR
- $1B Series F closes at $30B valuation - up from $9.1B in December 2023, a 3x jump in 16 months
- Lead investors: Drive Capital and Access Industries; NVIDIA, Fidelity, and NEA also participating
- $4B+ cumulative bookings, $500M+ ARR, positive free cash flow - the company says it didn't need the capital
- Customers include CoreWeave ($1.17B deal), Cursor, Mistral AI, JPMorgan Chase, U.S. Air Force
- AgentEngine coming in 2026: a deployment layer for AI agents with MCP integration and multi-agent audit trails
The $30 Billion Case for Storage
Vast Data CEO Renen Hallak was characteristically direct when asked why the company raised at all: "We never go out looking for funding rounds. We get inbounds all the time. We ask them, how much do you want to invest?"
That kind of confidence is easier to sustain when the underlying metrics hold. The company posted over $500 million in committed ARR at fiscal year-end, cumulative bookings above $4 billion, and has been cash-flow positive for years. Its Rule of X score - a benchmark that adds a weighted revenue growth rate to free-cash-flow margin, which investors use to grade software businesses - sits at 228%, a figure that would stand out in enterprise software at any scale.
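As a rough illustration of how such a score is computed (the exact weighting Vast's investors use, and the growth and margin inputs below, are hypothetical - only the 228% result is from the company):

```python
def rule_of_x(growth_rate: float, fcf_margin: float, growth_weight: float = 2.0) -> float:
    """Rule-of-X-style score: weighted revenue growth plus FCF margin.

    Bessemer's formulation typically weights growth 2-3x over
    profitability. The weight and inputs here are illustrative,
    not Vast Data's disclosed figures.
    """
    return growth_weight * growth_rate + fcf_margin

# Hypothetical inputs: 100% ARR growth, 28% free-cash-flow margin
score = rule_of_x(1.00, 0.28)
print(f"{score:.0%}")  # → 228%
```

The point of the metric is that growth counts for more than margin, which is why a fast-growing, merely break-even business can still post a high score.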
The round is a mix of primary and secondary capital, giving early investors and employees liquidity without requiring the company to pursue an IPO on anyone else's timeline. An IPO is anticipated before the end of 2026, but Vast Data isn't being pushed toward it by an empty cash account.
Why the Multiple Makes Sense Now
A $30 billion valuation for a storage software company would have been hard to justify before 2024. Traditional incumbents - NetApp, Pure Storage, IBM - trade at lower multiples even in favorable markets, because storage has historically been treated as commodity infrastructure, competing on price-per-terabyte rather than capability.
AI changed that framing. Training large models produces tens to hundreds of petabytes of checkpoints, datasets, and intermediate outputs. Inference at scale requires serving embedding stores and KV caches at throughput rates that legacy storage architectures weren't built to handle. The GPU cluster is no longer waiting on the CPU; increasingly, GPUs wait on storage. Whoever owns that bottleneck owns significant leverage in the AI stack.
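The bottleneck argument can be put in back-of-envelope terms: when a training job needs data faster than storage can deliver it, GPU utilization is capped by the ratio of delivered to required throughput. All figures below are hypothetical, chosen only to show the shape of the math:

```python
def io_bound_utilization(required_gbps: float, delivered_gbps: float) -> float:
    """Upper bound on GPU utilization when a job is I/O-bound.

    If storage delivers less throughput than the GPUs can consume,
    the cluster idles for the difference. Numbers are illustrative.
    """
    return min(1.0, delivered_gbps / required_gbps)

# Hypothetical cluster: GPUs can consume 500 GB/s of training data,
# but a legacy storage tier delivers only 200 GB/s
print(io_bound_utilization(500, 200))  # → 0.4
```

At those (made-up) numbers, 60% of an expensive GPU fleet sits idle - which is the economic leverage the storage layer holds.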
Vast Data's bet is that it built for this environment from the ground up while incumbents are retrofitting architectures designed for virtual machines and block storage.
How DASE Works - and Why It Matters
The technical foundation is an architecture the company calls DASE - Disaggregated Shared Everything. The design separates compute logic from storage capacity so the two scale independently. Stateless CPU servers handle processing while data lives in a single pool of commodity flash SSDs, removing the tiered hot/warm/cold hierarchy that creates bottlenecks in conventional NAS and SAN systems.
In practice, this means customers add GPU-adjacent compute without moving or replicating data. A single VAST cluster can span environments containing hundreds of thousands of GPUs globally. The flash-first approach also removes the manual tiering and data-classification work that absorbs operational time in conventional enterprise storage deployments.
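A toy model makes the disaggregation idea concrete (this is a conceptual sketch, not Vast's implementation): because compute nodes hold no state of their own, adding more of them never requires copying, replicating, or re-tiering data.

```python
class SharedFlashPool:
    """Single pool of storage visible to every compute node (conceptual)."""
    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def read(self, key: str) -> bytes:
        return self._objects[key]


class StatelessComputeNode:
    """Holds no data of its own; every node sees the same shared pool."""
    def __init__(self, pool: SharedFlashPool) -> None:
        self.pool = pool

    def process(self, key: str) -> int:
        # Any node can serve any request against the shared pool
        return len(self.pool.read(key))


pool = SharedFlashPool()
pool.write("checkpoint-0001", b"\x00" * 1024)

# Scale compute from 1 to 3 nodes: no data moves, nothing is replicated
nodes = [StatelessComputeNode(pool) for _ in range(3)]
print(all(n.process("checkpoint-0001") == 1024 for n in nodes))  # → True
```

In a shared-nothing design, by contrast, each node owns a slice of the data, so rescaling compute means rebalancing storage - the coupling DASE is built to avoid.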
"VAST sits at the center of how that system works, which is why we are seeing this level of demand at global scale."
-- Renen Hallak, Founder and CEO
Renen Hallak, Founder and CEO of VAST Data, speaking at a SiliconANGLE event.
Source: siliconangle.com
The company has extended this foundation into what it markets as an "AI Operating System for GPUs" - a unified platform combining data storage, compute adjacency, and real-time processing. Whether that positioning is engineering substance or marketing language is a question the customer list starts to answer. Mistral AI, Cursor, Crusoe, and JPMorgan Chase are all organizations with real AI workloads; they're not buying storage for marketing reasons.
The AgentEngine Expansion
Announced alongside the funding round, AgentEngine is a new product slated for 2026 that takes Vast Data further up the stack. The pitch: a production-grade deployment layer for AI agents, with containerized runtimes, lifecycle management, Model Context Protocol (MCP) integration, and full auditability across multi-agent pipelines.
This is a material shift from "we store the data your AI trains on" to "we run the agents that operate on that data." Whether that extension holds technically will depend on how AgentEngine performs against dedicated orchestration platforms. The customer list gives it credible pilot candidates - Cursor and Mistral AI in particular run non-trivial agentic workflows.
The NVIDIA Investment
NVIDIA's position in this round is strategic in a way that its model-lab investments aren't. GPU clusters are only as useful as the data they can reach at the speed they can process it. When storage is the constraint, GPU utilization drops and the economics of AI compute degrade.
NVIDIA's investment in Vast Data is a hedge and an accelerant: if DASE becomes the dominant storage substrate for AI workloads, NVIDIA-powered clusters perform better relative to their alternatives. The value flows back. NVIDIA has also backed CoreWeave, which is itself one of Vast Data's largest customers, creating an interlocking infrastructure chain that benefits all three when AI compute demand grows.
The CoreWeave Anchor
In November 2025, Vast Data announced a multi-year commercial partnership with CoreWeave valued at $1.17 billion - among the largest enterprise software contracts in storage history at the time. The deal makes Vast the primary data platform for CoreWeave's AI cloud infrastructure, combining CoreWeave's GPU-dense clusters with VAST's software for continuous model training, real-time inference, and large-scale data processing.
The partnership did two things for Vast Data beyond the revenue. It provided a reference architecture for what GPU-scale AI storage looks like in production, at a customer whose own scale verifies the design. And it gave investors a verifiable, multi-year contract to anchor the growth story. When your largest customer completes a multi-billion-dollar IPO of its own, that credibility carries into your next fundraise.
Modern data centers serving AI workloads require storage architectures purpose-built for high-throughput GPU access, not legacy NAS and SAN designs.
Source: pexels.com
What the Competition Looks Like
The AI storage market has real players. Pure Storage has invested in positioning its FlashBlade lines for AI workloads and has its own NVIDIA partnerships. WEKA - which targets similar GPU-adjacent storage use cases - has raised substantial capital in the same period. NetApp has pushed its AI data infrastructure story through integrations with NVIDIA's DGX platform. IBM's Storage Scale targets HPC and AI pipelines with similar architectural ambitions.
What none of those competitors have is the specific combination that Vast Data presents: the DASE architecture purpose-built for flash, the CoreWeave anchor deal at billion-dollar scale, the NVIDIA investment creating downstream alignment, and the customer roster spanning AI labs, hyperscaler infrastructure, and regulated enterprise simultaneously.
The structural risk is that hyperscalers can build their own storage substrate with enough engineering investment - and that as AI workloads mature, the differentiation window may narrow. Vast Data's IPO timeline and the capital raised suggest leadership believes the moat is wide enough to make that a problem for a later date.
Storage was never supposed to be this interesting. The $30 billion valuation reflects a market that's recalibrating which parts of the AI stack deserve frontier-grade investment - and concluding that the layer where model weights live, where training data flows, and where inference caches reside isn't a commodity after all.