Articles Tagged "Inference"

NVIDIA Groq 3 LPU - SRAM-Based Inference Engine

The NVIDIA Groq 3 LPU is a pure-SRAM inference chip delivering 150 TB/s memory bandwidth and 1.2 PFLOPS FP8 per chip, designed to pair with Vera Rubin GPUs for trillion-parameter model serving.

Positron Atlas - FPGA Inference Server

The Positron Atlas is an 8-card FPGA inference server that draws 2000W in a single 1U chassis and delivers 4.5x better performance per watt than the NVIDIA DGX H200.

South Korea Bets $400M on Rebellions to Rival Nvidia

South Korean AI chip startup Rebellions has closed a $400M pre-IPO round at a $2.34B valuation, with the government's Korea National Growth Fund leading Seoul's first direct bet under its K-Nvidia initiative.

Arm Launches AGI CPU, Its First Chip in 35 Years

At its Arm Everywhere event in San Francisco, Arm unveiled the AGI CPU, a 136-core data center processor co-developed with Meta and the first silicon product of its own in the company's 35-year history.