TL;DR

20-core ARM CPU (10x Cortex-X925 + 10x Cortex-A725) paired with a Blackwell RTX GPU carrying 6,144 CUDA cores
128GB LPDDR5x unified memory at 300 GB/s - runs 120-billion-parameter models locally with 1M token context
1 petaFLOP FP4 AI compute via 5th-generation Tensor Cores; 70 billion transistors on TSMC 3nm
NVIDIA's play against Apple Silicon and Qualcomm Snapdragon for the high-end Windows AI PC market

Overview

NVIDIA announced the RTX Spark superchip at Computex 2026 on June 1, marking the company's formal entry into the personal computing SoC market. RTX Spark combines a 20-core NVIDIA Grace CPU with a Blackwell RTX GPU in a single package, connected via NVIDIA's NVLink-C2C chip-to-chip interconnect. The result is a 70-billion-transistor SoC built on TSMC's 3nm process that shares a unified 128GB pool of LPDDR5x memory across both processors.

This is NVIDIA positioning itself in a market Apple has controlled with the M-series. The M4 Max offers 546 GB/s of memory bandwidth against RTX Spark's 300 GB/s - a real gap. But NVIDIA's counter is architectural: 5th-generation Tensor Cores with native FP4 support give the Blackwell GPU capabilities Apple's Neural Engine doesn't match at this form factor. Running a 120-billion-parameter model locally isn't something any current M4 device can do without quantization tricks that sacrifice quality.

The same underlying chip already ships inside the DGX Spark personal AI supercomputer at $4,699 - a product NVIDIA launched at GTC 2026. RTX Spark rebrands the GB10 Grace Blackwell chip for OEM laptop and desktop integration, with over 30 laptop models expected from ASUS, Dell, HP, Lenovo, Microsoft, and MSI arriving fall 2026. Microsoft's Surface Laptop Ultra is among the first announced.

NVIDIA isn't claiming RTX Spark beats Apple on every axis. It isn't. But for local AI inference with large models, the Tensor Core and FP4 advantage matters more than peak memory bandwidth.

Key Specifications

Specification	Details
Manufacturer	NVIDIA
Product Family	RTX Spark (GB10 Grace Blackwell)
Chip Type	SoC
Process Node	TSMC 3nm
Transistor Count	70 billion
CPU Architecture	NVIDIA Grace (ARM)
CPU Cores	20 (10x Cortex-X925 + 10x Cortex-A725)
CPU Max Frequency	4.1 GHz (performance cores)
GPU Architecture	NVIDIA Blackwell
CUDA Cores	6,144
Tensor Cores	5th Generation (FP4 support)
RT Cores	4th Generation
Memory	128GB LPDDR5x
Memory Interface	256-bit
Memory Bandwidth	300 GB/s
AI Performance	1,000 TOPS / 1 PFLOP (FP4 with sparsity)
CPU-GPU Interconnect	NVLink-C2C
TDP (SoC)	140W
Release Date	June 2026 (DGX Spark) / Fall 2026 (OEM devices)

Performance Benchmarks

No third-party benchmark data exists for RTX Spark laptops yet - OEM devices don't ship until fall 2026. The DGX Spark desktop (same chip) has built up some real-world numbers since its GTC 2026 launch.

Metric	RTX Spark / DGX Spark	Apple M4 Max	Qualcomm X Elite
AI Compute (FP4)	1,000 TOPS	~40 TOPS (Neural Engine)	45 TOPS (NPU)
Unified Memory	128GB	128GB (max config)	64GB (max)
Memory Bandwidth	300 GB/s	546 GB/s	135 GB/s
Max LLM Parameters (local)	120B	~40-50B (practical)	~13B (practical)
TDP (SoC)	140W	~36W (M4 Max)	~23W
Process Node	3nm	3nm	4nm

The memory bandwidth deficit versus Apple is real. For memory-bandwidth-bound inference - which describes most LLM serving - an M4 Max will outrun an RTX Spark on per-watt throughput for the same model size. The trade-off flips for model sizes above roughly 50 billion parameters, where RTX Spark can run natively in FP4 while Apple devices must resort to aggressive compression.
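
To put rough numbers on the bandwidth argument, here is a back-of-the-envelope sketch in Python. It assumes single-stream decoding of a dense model in which every weight is read from memory once per generated token, and it ignores KV cache traffic and compute limits, so the result is a ceiling rather than a prediction. The function name and the 40B example model are illustrative choices, not figures from NVIDIA or Apple.

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billion: float,
                             bits_per_weight: float) -> float:
    """Upper bound on decode speed when each token requires reading all weights once."""
    # 1e9 parameters * (bits / 8) bytes each = size in GB
    weight_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / weight_gb


# Same hypothetical 40B dense model at 4 bits per weight (20 GB) on both platforms.
rtx_spark = decode_tokens_per_second(300, 40, 4)
m4_max = decode_tokens_per_second(546, 40, 4)

# A 120B model at 4 bits per weight (60 GB) on RTX Spark.
rtx_spark_120b = decode_tokens_per_second(300, 120, 4)

print(f"RTX Spark, 40B @ 4-bit:  {rtx_spark:.1f} tokens/s ceiling")       # 15.0
print(f"M4 Max,    40B @ 4-bit:  {m4_max:.1f} tokens/s ceiling")          # 27.3
print(f"RTX Spark, 120B @ 4-bit: {rtx_spark_120b:.1f} tokens/s ceiling")  # 5.0
```

Under these assumptions the ceiling scales directly with bandwidth, which is why the 300 GB/s versus 546 GB/s gap shows up at equal model size. The sketch says nothing about prompt processing, where compute throughput rather than bandwidth tends to dominate.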

NVIDIA claims RTX Spark laptops run Adobe Premiere and Photoshop twice as fast as competing platforms after application-level optimization. These claims have not been verified by independent testing as of publication; this article will be updated when review hardware ships.

Key Capabilities

NVLink-C2C Integration

The Grace CPU and Blackwell GPU share memory directly through NVLink-C2C rather than going through a discrete PCIe connection. This eliminates the copy overhead that penalizes GPU-offloaded tasks on conventional designs. For agentic AI workloads - where a language model, a vision system, and a code executor might all be running simultaneously - shared memory with zero-copy access is a meaningful advantage over PCIe-connected discrete GPU designs.
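
As a rough illustration of the copy overhead in question, the Python sketch below estimates how long a staging copy takes over a PCIe link. The 64 GB/s figure is the approximate theoretical per-direction peak of a PCIe 5.0 x16 slot and is an assumption brought in for comparison, not a number from NVIDIA's materials. The payload size and handoff count are likewise hypothetical.

```python
def copy_seconds(payload_gb: float, link_gb_s: float) -> float:
    """Time to move a buffer across a link at its peak rate (best case, no latency)."""
    return payload_gb / link_gb_s


# Assumption: approximate theoretical peak of PCIe 5.0 x16, one direction.
PCIE5_X16_GB_S = 64.0

# Hypothetical agentic pipeline: an 8 GB working buffer handed between
# CPU-side and GPU-side stages 20 times during one task.
payload_gb = 8.0
handoffs = 20

per_handoff = copy_seconds(payload_gb, PCIE5_X16_GB_S)
total_discrete = per_handoff * handoffs

print(f"Per handoff over PCIe: {per_handoff:.3f} s")    # 0.125
print(f"Total over {handoffs} handoffs: {total_discrete:.1f} s")  # 2.5
```

With a shared memory pool, the handoff amounts to passing a pointer, so this copy term drops out. Real transfers rarely hit the theoretical peak, so the discrete-GPU figure here is a lower bound on the cost.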

FP4 Tensor Core Support

5th-generation Tensor Cores handle native FP4 matrix multiplication with sparsity acceleration. At FP4, each weight takes half the space it does at FP8, so the Blackwell GPU can fit a model with twice as many parameters in the same memory budget, or achieve roughly 2x throughput on the same model. Apple's Neural Engine doesn't natively support FP4. Qualcomm's NPU supports INT4 but not the OCP MX FP4 format NVIDIA uses, which has better dynamic range than integer quantization.
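
The memory arithmetic behind that claim can be sketched in a few lines of Python. The footprint function is a simple bits-per-weight calculation. The 4.25-bit figure assumes the OCP MX layout of one shared 8-bit scale per block of 32 FP4 elements, and the value lists compare the magnitudes representable by a 4-bit E2M1 float against a symmetric INT4 grid. All names here are illustrative, and the sketch covers weights only, not activations or KV cache.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Memory for model weights alone: 1e9 params * bits / 8 bytes = GB."""
    return params_billion * bits_per_weight / 8


# Assumption: MX-style FP4 stores one 8-bit scale per 32-element block,
# adding 8 / 32 = 0.25 bits per weight on top of the 4-bit element.
MXFP4_EFFECTIVE_BITS = 4 + 8 / 32

for label, bits in [("FP16", 16), ("FP8", 8), ("FP4", 4),
                    ("FP4 + block scales", MXFP4_EFFECTIVE_BITS)]:
    print(f"120B at {label}: {weight_footprint_gb(120, bits):.2f} GB")

# Non-zero magnitudes representable in each 4-bit format, before any scaling.
E2M1_MAGNITUDES = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # 4-bit float: sign, 2 exp, 1 mantissa
INT4_MAGNITUDES = [1, 2, 3, 4, 5, 6, 7]                # symmetric signed integer grid

fp4_range = max(E2M1_MAGNITUDES) / min(E2M1_MAGNITUDES)
int4_range = max(INT4_MAGNITUDES) / min(INT4_MAGNITUDES)
print(f"Largest/smallest non-zero magnitude: FP4 {fp4_range:.0f}x, INT4 {int4_range:.0f}x")
```

At 4 bits a 120B model needs about 60 GB for weights, versus 120 GB at FP8, which is the difference between fitting in a 128GB pool with room to spare and nearly filling it. The float grid spends its codes unevenly, with finer steps near zero and a wider max-to-min ratio, which is one way to read the "better dynamic range" claim.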

Local Reasoning with Long Context

NVIDIA's specification sheet claims 1 million token context length on-device at up to 120 billion parameters. In practice, context length and model size interact through the KV cache - a 1M context at 120B parameters would demand massive KV cache memory - but the raw memory pool of 128GB makes it feasible where 64GB devices cannot go.
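
A short Python sketch shows how the KV cache interacts with the memory pool. The model dimensions below (36 layers, 8 key/value heads, head dimension 64) are hypothetical values chosen for illustration and do not describe any specific model or NVIDIA's test configuration. The 60 GB weight figure assumes 120B parameters at 4 bits each, and the sketch ignores activations and operating system overhead.

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int) -> float:
    """KV cache size: one key and one value vector per layer, per KV head, per token."""
    values = 2 * layers * kv_heads * head_dim * context_tokens  # 2 = keys + values
    return values * bytes_per_value / 1e9


MEMORY_POOL_GB = 128.0   # unified memory from the spec table
WEIGHTS_GB = 60.0        # assumption: 120B parameters at 4 bits per weight
CONTEXT = 1_000_000      # the claimed 1M-token context

# Hypothetical model shape, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 36, 8, 64

results = {}
for label, nbytes in [("FP16", 2), ("FP8", 1)]:
    cache = kv_cache_gb(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, nbytes)
    total = WEIGHTS_GB + cache
    results[label] = (cache, total <= MEMORY_POOL_GB)
    print(f"{label} KV cache: {cache:.1f} GB, weights + cache: {total:.1f} GB, "
          f"fits in 128 GB: {total <= MEMORY_POOL_GB}")
```

For this hypothetical shape, a 1M-token cache overshoots the pool at FP16 but fits at FP8, so whether the headline figure is reachable depends heavily on the model's attention layout and the precision used for the cache. Models with more layers or more KV heads would need proportionally more.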

Gaming and Content Creation

RTX Spark isn't only an AI chip. 6,144 CUDA cores with DLSS 4.5 Multi Frame Generation target 100 FPS at 1440p in games. The Blackwell media engine handles 12K 4:2:2 video editing and 4K AI video generation locally. For creators who also do AI work, this replaces the awkward two-device workflow of a MacBook and a separate AI workstation.

Pricing and Availability

The DGX Spark desktop - the first commercial product using the GB10 chip - launched at $4,699 for the 4TB storage configuration. That price covers the full desktop system including power supply and NVMe storage, not just the chip.

OEM laptop pricing hasn't been announced. With the MacBook Pro M4 Max starting at $2,499, competing premium platforms suggest RTX Spark laptops will land in the $2,500-$5,000 range. Microsoft's Surface Laptop Ultra, Dell's XPS 16, HP's OmniBook X 14, and Lenovo's Yoga Pro 9n are among the first wave.

For cloud access, NVIDIA's own DGX Cloud platform offers RTX Spark compute for developers who want to test the platform before hardware ships.

Strengths

1 petaFLOP FP4 in a laptop form factor - nothing else at this TDP comes close
128GB unified memory runs models no other laptop chip can fit
Full NVIDIA software stack: CUDA, TensorRT, NIM microservices, DLSS, RTX
Gaming and content creation performance alongside AI workloads in one chip
NVLink-C2C removes PCIe bottleneck between CPU and GPU

Weaknesses

300 GB/s memory bandwidth trails Apple M4 Max's 546 GB/s significantly
140W TDP limits battery life and thin-laptop form factors
OEM software optimization still maturing - Windows ARM ecosystem remains incomplete compared to macOS
No confirmed pricing for OEM devices yet
Fall 2026 availability for laptop/desktop products; only DGX Spark desktop ships now

Related Coverage

NVIDIA RTX 5090 - Blackwell Desktop GPU - the discrete GPU option for builders who don't want a SoC
Apple M5 Max - the primary competitor in the high-end laptop AI compute space
NVIDIA GB300 NVL72 - Blackwell Ultra Rack - where the same Blackwell architecture goes at datacenter scale
