Etched Exits Stealth With Working Chip and $1B in Orders
Transformer ASIC startup Etched comes out of stealth with first-pass silicon on TSMC N4P, $800M raised, and more than $1B in signed customer contracts.

Four years after making a very concentrated bet - that transformer architecture would stay dominant long enough to justify hardwiring it into silicon - Etched came out of stealth today with a working chip, $800M in total funding, and over $1 billion in signed customer contracts.
This isn't a paper launch. The company says it achieved first-pass silicon success on TSMC's N4P process, meaning the first tape-out came back from the fab working, with no respin required. It's now validating its rack-scale inference systems with customers and plans to start shipping this summer.
Key Facts
| Detail | Value |
|---|---|
| Process node | TSMC N4P |
| Memory per chip | 144GB HBM3E |
| Claimed throughput | 500K+ tokens/sec (8-chip server, Llama 70B) |
| Total raised | $800M |
| Signed contracts | $1B+ |
| Employees | 400+ |
| Planned shipping | Summer 2026 |
First-Pass Silicon Is Not a Small Thing
In chip design, "first-pass silicon success" - sometimes called A0 success - means the chip returned from the fab and functioned as intended without requiring a redesign iteration. Most complex chips need at least one or two respins before they work reliably. The fact that Etched's chip, called Sohu, worked on the first attempt on TSMC's N4P process - a production node - suggests that the company's design methodology is solid.
The Chip That Does One Thing
Sohu is built differently from a GPU. Where Nvidia's H100 and H200 are programmable - you can use them for training, inference, rendering, or scientific compute - Sohu hardwires transformer attention directly into the transistor logic as fixed-function circuitry. There's no general-purpose compute core.
The tradeoff is obvious and the company has never hidden it: everything that isn't a transformer pays a penalty or doesn't run. The upside is that everything that's a transformer runs at a level of efficiency no programmable chip can match.
Etched says the current systems support DeepSeek, Qwen, Mamba, and Llama workloads. The inclusion of Mamba - an SSM architecture, not a standard transformer - suggests the chip has broader architecture support than the original "transformer-only" framing implied. The company hasn't published the technical details explaining how Mamba support was implemented.
The Sohu chip card. The chip hardwires transformer attention into silicon as fixed-function logic.
Source: etched.com
Full-Stack, Not Just the Die
Etched didn't just design a chip. The company designed the entire server rack: cooling plates, networking, power delivery, and circuit boards. The server shown in their launch materials has a distinctive form factor built around liquid cooling - every component co-designed with the chip's thermal requirements.
This full-stack approach mirrors what Google did with the TPU pod: when you control every layer from the die to the rack, you can optimize the entire power and thermal budget together rather than fitting a general-purpose board to a chip someone else designed.
$800M Raised and $1B in Signed Contracts
The $800M total comes from multiple rounds. The most recent was $500M in December at a $5B post-money valuation, led by Stripes. Prior to that, Jane Street committed more than $100M in a round Etched kept quiet until today. VentureTech Alliance - a venture firm with a strategic partnership with TSMC - also participated, providing a direct line to Etched's manufacturing partner.
Etched's inference server rack. The company designed the full hardware stack - chip, board, cooling, and rack - rather than adapting existing server designs.
Source: etched.com
Who Backed It
| Investor | Type | Notable Detail |
|---|---|---|
| Stripes | VC firm | Lead, Series B ($500M round) |
| Jane Street | Trading firm | $100M+ committed |
| VentureTech Alliance | TSMC-linked VC | Strategic manufacturing link |
| Hudson River Trading | High-frequency trading | |
| Jump Trading | High-frequency trading | |
| Two Sigma | Quant / HFT | |
| Ribbit Capital | Fintech VC | |
| Peter Thiel | Individual | Existing backer |
| Geoffrey Hinton, Fei-Fei Li, Andrej Karpathy | AI researchers | Individual angels |
| Stanley Druckenmiller | Hedge fund | Individual angel |
The concentration of high-frequency trading firms in this cap table - Jane Street, Hudson River Trading, Jump Trading, Two Sigma - stands out. These firms built their businesses around low-latency compute and are among the most sophisticated hardware buyers in the market. When firms like that write checks this size into an inference chip startup, it suggests they see a use case they aren't advertising publicly.
$1B in Signed Contracts
The company is being careful with language: these are signed customer contracts, not letters of intent or pipeline. CEO Gavin Uberti said the company saw "frontier AI would become one of the most economically significant technologies ever created, but the needed infrastructure simply did not exist" - and found enough customers who agreed to sign before the chip shipped.
Ramping production to fulfill $1B in contracts requires significant manufacturing capacity. Etched has a factory in Taiwan, plus a data center and prototyping lab in San Jose. The company is targeting gigawatt-scale operations by 2027.
How Sohu Stacks Up on Paper
These are vendor-supplied numbers unless otherwise noted. No independent organization has benchmarked Sohu at production scale.
| Metric | Sohu (8-chip server) | H100 (8-chip server) | H200 (8-chip server) |
|---|---|---|---|
| Tokens/sec, Llama 70B (batch 1) | 500,000+ | ~23,000 | ~35,000 |
| Memory per chip | 144GB HBM3E | 80GB HBM3 | 141GB HBM3E |
| Process node | TSMC N4P | TSMC 4N | TSMC 4N |
| Architecture | Transformer ASIC | General-purpose GPU | General-purpose GPU |
| Software stack | Proprietary compiler | CUDA + vLLM | CUDA + vLLM |
| Independent benchmarks | None published | Extensive | Extensive |
| Availability | Summer 2026 | Now | Now |
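The roughly 20x throughput claim holds on the narrow benchmark Etched measured it on, and so far only there. By the company's figures, one 8-chip Sohu server delivers about 20 times the throughput of an 8-GPU H100 server, which is the basis for its claim that a single Sohu server can replace roughly 160 H100 GPUs for Llama 70B inference at batch size 1. Those are exceptional numbers for a latency-bound workload.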
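The arithmetic behind the 20x and 160-GPU figures can be reproduced from the table's vendor-supplied numbers. The Python sketch below is illustrative only: the constant and function names are made up for this example, and the inputs are the approximate table values, not independent measurements.

```python
# Vendor-supplied throughput figures from the comparison table
# (tokens/sec, Llama 70B, 8-chip servers). These are Etched's numbers,
# not independently verified benchmarks.
SOHU_SERVER_TPS = 500_000  # stated as "500,000+"
H100_SERVER_TPS = 23_000   # approximate
H200_SERVER_TPS = 35_000   # approximate
GPUS_PER_SERVER = 8


def speedup(sohu_tps: float, gpu_server_tps: float) -> float:
    """Server-to-server throughput ratio."""
    return sohu_tps / gpu_server_tps


def equivalent_gpus(sohu_tps: float, gpu_server_tps: float,
                    gpus_per_server: int = GPUS_PER_SERVER) -> float:
    """Number of individual GPUs needed to match one Sohu server."""
    return speedup(sohu_tps, gpu_server_tps) * gpus_per_server


if __name__ == "__main__":
    for name, tps in [("H100", H100_SERVER_TPS), ("H200", H200_SERVER_TPS)]:
        ratio = speedup(SOHU_SERVER_TPS, tps)
        gpus = equivalent_gpus(SOHU_SERVER_TPS, tps)
        print(f"vs {name}: {ratio:.1f}x per server, ~{gpus:.0f} GPUs per Sohu server")
```

Taken at face value, the table values give about 21.7x and about 174 H100s. The headline 20x and 160-GPU figures correspond to rounding the ratio down to 20 (20 servers of 8 GPUs each).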
Etched is entering a field that already has serious challengers. Groq's LPU built a commercially available inference product around similar throughput-focused logic. Cerebras took a different architectural path and has been shipping to enterprise and government customers. Neither has displaced Nvidia. The question for Etched is whether its numbers justify the engineering cost of migration.
Inside the Etched server: water cooling lines and copper heat exchangers co-designed with the Sohu chip's thermal profile.
Source: etched.com
Where It Falls Short
Vendor Benchmarks Only
Every throughput number in Etched's announcement comes from Etched's own controlled demonstrations. The 500K tokens/sec figure is measured at batch size 1 - a single concurrent request. This is the best possible scenario for a chip optimized for sequential token generation.
At batch size 256 - closer to what a real inference API handles under load - a single H100 delivers around 45,000 tokens per second. Etched hasn't published comparable batch-256 numbers. The gap between batch-1 and batch-256 performance is where vendor claims and production reality tend to diverge.
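To see why the missing batch-256 numbers matter, here is a rough back-of-envelope sketch in Python. It assumes, naively, that the per-GPU batch-256 figure above scales linearly across an 8-GPU server. The source does not state this, and real multi-GPU scaling is usually worse. The names are illustrative.

```python
# Figures quoted in the article; neither is an independent measurement.
H100_BATCH256_TPS_PER_GPU = 45_000  # single H100 at batch size 256 (approximate)
SOHU_BATCH1_SERVER_TPS = 500_000    # Etched's 8-chip server claim at batch size 1
GPUS_PER_SERVER = 8


def naive_server_tps(per_gpu_tps: float, gpus: int = GPUS_PER_SERVER) -> float:
    """ASSUMPTION: perfect linear scaling across GPUs (an optimistic upper bound)."""
    return per_gpu_tps * gpus


def throughput_ratio(sohu_server_tps: float, gpu_server_tps: float) -> float:
    """How many times faster the Sohu server figure is than the GPU server figure."""
    return sohu_server_tps / gpu_server_tps


if __name__ == "__main__":
    h100_server = naive_server_tps(H100_BATCH256_TPS_PER_GPU)
    ratio = throughput_ratio(SOHU_BATCH1_SERVER_TPS, h100_server)
    print(f"8x H100 at batch 256 (naive): {h100_server:,.0f} tokens/sec")
    print(f"Sohu batch-1 claim vs that figure: {ratio:.2f}x")
```

This is not a like-for-like comparison, since it sets Sohu's batch-1 claim against an H100 batch-256 estimate. It only shows how sensitive the headline multiple is to batch size, which is why Sohu's own batch-256 results would be needed to judge production throughput.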
The Proprietary Stack
Deploying on Sohu means abandoning vLLM, TensorRT-LLM, and SGLang - three of the most widely used open-source inference frameworks. Etched supplies its own compiler and software stack. For an engineering team running production inference today, adopting an unproven toolchain from a startup that has not yet shipped at scale is a meaningful risk, regardless of the performance ceiling.
The company has over 400 employees, with a significant portion recruited from Nvidia, Broadcom, Google TPU, and SK Hynix. The talent base is credible. But a proprietary software stack needs more than good engineers - it needs the ecosystem adoption that makes debugging tractable at 3am when a model is throwing errors.
The Architectural Bet Still Has Tail Risk
Etched's founders have acknowledged the risk clearly: if transformer architecture becomes obsolete, the company's chips become obsolete with it. MoE models like DeepSeek V4 and hybrid SSM-transformer architectures are already pushing the definition of "transformer" well beyond the original formulation. The Mamba support Etched now claims was not in the original product pitch.
"If you have compute now, people will buy it."
Patrick O'Shaughnessy, CEO of Positive Sum (Etched investor), captured the thesis in one sentence. The bet is that inference demand is so large and so urgent that a chip delivering 20x throughput on a specific workload will find customers before a more flexible architecture catches up on that workload.
Summer 2026 is when that thesis meets production traffic. The $1B in contracts suggests the customers are already waiting.
Sources:
- Etched Emerges From Stealth With Working Chip, $800M Raised, and Over $1B in Customer Contracts - Yahoo Finance / GlobeNewswire
- Nvidia rival Etched raises $800M with backing from Jane Street and a TSMC-linked fund - The Next Web
- Nvidia competitor Etched hits $5B valuation, $1B in sales for AI chip - TechCrunch
- Etched AI Sohu vs NVIDIA: Transformer ASIC vs General-Purpose GPU for LLM Inference - Spheron Blog
- From Dorm Room Beginnings to a Pioneer in the AI Chip Revolution - Rambus
