
Qualcomm AI200 - Rack-Scale Inference ASIC
Qualcomm AI200 specs and analysis - a Hexagon-based inference accelerator with 768GB LPDDR per card, rack-scale design, and a focus on inference TCO.

Qualcomm AI200 specs and analysis - a Hexagon-based inference accelerator with 768GB LPDDR per card, rack-scale design, and a focus on inference TCO.

Complete specs and analysis of SambaNova's SN50 RDU - a TSMC 3nm dataflow chip with 3.2 PFLOPS FP8, three-tier memory, and claimed 5x speed over NVIDIA B200.

AWS Trainium2 is Amazon's second-generation custom AI training chip, powering EC2 Trn2 instances with 96GB HBM2e per chip and tight integration with the AWS Neuron SDK and SageMaker ecosystem.

Full specs and analysis of the Cambricon MLU590 - 192GB HBM2e, ~2,400 GB/s bandwidth, TSMC 7nm, and what it means for AI inference outside the NVIDIA ecosystem.

Groq's Language Processing Unit (LPU) is a purpose-built inference ASIC that trades HBM for 230MB of on-chip SRAM, delivering deterministic latency and record-breaking tokens-per-second for LLM serving.

Intel Gaudi 3 is a TSMC 5nm AI accelerator with 128GB HBM2e and 1,835 TFLOPS FP8 performance, positioned as a cost-effective alternative to NVIDIA H100 for training and inference workloads.