GPU Cloud Pricing Comparison - March 2026

TL;DR

Cheapest H100: Vast.ai at ~$1.38/hr on-demand, RunPod spot at ~$1.25/hr
Cheapest A100: Vast.ai at ~$0.29/hr, Azure spot at ~$0.74/hr
Cheapest RTX 4090: Salad at $0.204/hr (distributed consumer GPUs)
Best overall value: RunPod (balance of price, reliability, and ease of use)
Hyperscalers (AWS, GCP, Azure) charge roughly 50-250% more than RunPod for an on-demand H100 but offer enterprise compliance
H100 market average dropped to ~$3.13/hr - down from $4-5/hr in early 2025

We tracked current GPU pricing across 19 providers as of late March 2026. Prices have dropped significantly since our last update - AWS cut H100 pricing 44% in June 2025, which pressured the entire market. Neo-cloud providers now offer H100s for 40-85% less than hyperscalers.

H100 80GB SXM Pricing

The most in-demand GPU for AI training and inference. Sorted by on-demand price.

Provider	On-Demand/hr	Spot/hr	Reserved/hr	Monthly (on-demand)	Min Commit
RunPod (Spot)	-	$1.25	-	-	None
Vast.ai	$1.38-1.87	~$1.00+	50% off (3-6mo)	~$1,000-1,350	None
Thunder Compute	$1.38	N/A	N/A	~$1,000	None
RunPod (Community)	$1.99	$1.25	N/A	~$1,433	None
FluidStack	$2.10	N/A	Contact	~$1,512	None
RunPod (Secure)	$2.39-2.69	Available	N/A	~$1,721-1,937	None
Jarvislabs	$2.69	N/A	Custom (25+)	~$1,937	None
Nebius	$2.95	N/A	$2.00	~$2,124	None
Vultr	$2.99	N/A	$2.30 (36mo)	~$2,153	None
GCP (A3-highgpu)	~$3.00	~$2.25	Auto-SUD	~$2,160	None
Lambda	$3.29-4.29	N/A	Contact (1-3yr)	~$2,369-3,089	None
Together AI	$3.49-3.99	N/A	$2.25-2.69	~$2,513-2,873	None
Crusoe	$3.90	Contact	Contact	~$2,808	None
AWS (p5.48xlarge)	~$3.90	~$2.50	$1.90-2.10 (1-3yr)	~$2,808	None
Modal	~$3.95	N/A	N/A	~$2,844	None
Paperspace	$5.95	N/A	$2.24 (3yr)	~$4,284	None
CoreWeave	~$6.16	N/A	~$1.45 (reserved)	~$4,435	Enterprise
Azure (ND H100 v5)	~$6.98	~$3.50	Available (1yr/3yr)	~$5,026	None
Oracle Cloud	~$10.00	N/A	Contact	~$7,200	8-GPU min
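
The monthly figures in these tables appear to be the hourly rate multiplied by a 720-hour (30-day) month. That factor is inferred from the numbers, not stated by any provider, so treat the sketch below as a way to reproduce the column for your own rate rather than as a billing formula. The function name and the `hours_per_day` parameter are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption: inferred from the table figures, not a provider rule


def monthly_cost(hourly_rate: float, hours_per_day: float = 24.0) -> float:
    """Estimate monthly spend in USD for one GPU billed at hourly_rate."""
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    return hourly_rate * hours_per_day * DAYS_PER_MONTH


if __name__ == "__main__":
    # Rates taken from the H100 table above.
    for name, rate in [("FluidStack", 2.10), ("Nebius", 2.95), ("Paperspace", 5.95)]:
        print(f"{name}: ~${monthly_cost(rate):,.0f}/month")
```

Passing a smaller `hours_per_day` gives a rough figure for providers with per-minute or per-second billing, where you only pay while the instance is up.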

A100 80GB Pricing

Still widely available and excellent value for inference and smaller training runs.

Provider	On-Demand/hr	Spot/hr	Monthly (on-demand)
Vast.ai	$0.29-1.50	Marketplace	~$209-1,080
Azure (Spot)	-	$0.74	-
Crusoe (Spot)	-	$1.20-1.30	-
Paperspace (3yr)	$1.15	N/A	~$828
Jarvislabs	$1.49	N/A	~$1,073
Vultr (36mo)	$1.49	N/A	~$1,073
RunPod	~$1.64	Lower	~$1,181
Crusoe	$1.65-1.95	$1.20-1.30	~$1,188-1,404
Modal	~$2.50	N/A	~$1,800
CoreWeave	~$2.70	N/A	~$1,944
Lambda	$2.79	N/A	~$2,009
Vultr (on-demand)	$2.80	N/A	~$2,016
AWS (p4de)	~$3.43	~$3.07	~$2,470
Azure	~$3.67	~$0.74	~$2,642
GCP	~$5.78	~$2.51	~$4,162

H200 141GB Pricing

The newest high-end option. Limited availability across providers.

Provider	On-Demand/hr	Reserved/hr	Monthly (on-demand)
Nebius (committed)	$3.50	$2.30	~$2,520
Together AI (reserved)	$4.19	$2.59-3.19	~$3,017
Jarvislabs	$3.80	Custom	~$2,736
Crusoe	$4.29	Contact	~$3,089
RunPod	$4.31	N/A	~$3,103
Modal	~$4.54	N/A	~$3,269
CoreWeave	~$6.31	Up to 60% off	~$4,543

RTX 4090 24GB Pricing

The sweet spot for inference and small-model fine-tuning.

Provider	On-Demand/hr	Monthly
Salad	$0.204	~$147
Vast.ai	$0.35-0.55	~$252-396
RunPod	~$0.34	~$245
Spheron	$0.58	~$418
Buy your own	-	~$1,800 (one-time)

At $0.204/hr on Salad's distributed consumer GPU network, cumulative rental cost matches the ~$1,800 price of buying your own RTX 4090 after about 12 months of continuous use. If you need the GPU less than full-time, break-even takes proportionally longer, so renting from Salad or Vast.ai is usually cheaper than owning.
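
A minimal sketch of the break-even arithmetic, using the ~$1,800 purchase price and $0.204/hr rental rate from this section. It assumes a 30-day month and deliberately ignores electricity, the rest of the machine, and resale value, all of which shift the result; the function name is illustrative.

```python
def break_even_months(purchase_price: float, rental_rate: float,
                      hours_per_day: float = 24.0) -> float:
    """Months of renting after which cumulative rent equals the purchase price."""
    if rental_rate <= 0 or not 0 < hours_per_day <= 24:
        raise ValueError("rental_rate must be > 0 and hours_per_day in (0, 24]")
    hours_to_break_even = purchase_price / rental_rate
    hours_per_month = hours_per_day * 30  # assumption: 30-day month
    return hours_to_break_even / hours_per_month


if __name__ == "__main__":
    for hours in (24, 8, 2):
        months = break_even_months(1800, 0.204, hours_per_day=hours)
        print(f"{hours:>2} h/day: break-even after ~{months:.0f} months")
```

At 8 hours/day the break-even point is three times further out than at 24/7 use, which is why part-time workloads tend to favor renting.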

Provider Profiles and User Reviews

Tier 1: Best Value (Budget-Friendly, Good Reliability)

RunPod - The default recommendation for most teams. Per-minute billing, instant GPU availability for most types, templates for quick environment setup. Community Cloud is cheaper but less reliable than Secure Cloud. Trustpilot: 204 reviews, positive. Users praise speed, pricing, and 15-minute support response times.

Vast.ai - The cheapest option, period, but buyer beware. P2P marketplace means prices and reliability vary by host. Instances can disappear mid-job. Per-second billing, 68+ GPU types. Trustpilot: 4.4/5 (209 reviews). Great for batch processing and experiments that can tolerate interruptions.

Jarvislabs - Under-the-radar but competitive, especially for A100 ($1.49/hr) and H100 ($2.69/hr). Per-minute billing, spin-up in under 90 seconds, managed Jupyter workbenches. Good for quick experiments.

Tier 2: Mid-Range (More Features, Moderate Premium)

Lambda - Simplest UX in the market. Pre-installed PyTorch, 1-Click Clusters with InfiniBand. H100 at $3.29-4.29/hr. No spot pricing. Stock-outs are common for H100s. Best for researchers who want zero setup friction.

Nebius - Competitive committed pricing: H100 at $2.00/hr, H200 at $2.30/hr with multi-month agreements. Up to 35% savings on reservations. Newer provider with less community feedback but strong NVIDIA partnership.

Vultr - Straightforward pricing with rare AMD GPU options (MI300X at $1.85, MI355X at $2.59). H100 at $2.99/hr on-demand, $2.30 with 36-month prepaid. Good for teams wanting AMD hardware.

Together AI - Dual offering: serverless inference APIs and dedicated GPU clusters. H100 clusters at $3.49/hr with reserved options down to $2.25/hr. Best if you need both training compute and inference APIs from one provider.

Crusoe - Powered by 100% clean energy. Competitive A100 spot ($1.20/hr) and one of the few providers offering AMD MI300X ($3.45/hr). No egress charges. Per-minute billing.

Tier 3: Enterprise / Specialized

CoreWeave - Kubernetes-native, InfiniBand networking, supports 256+ GPU clusters. Reserved H100 at ~$1.45/hr versus ~$6.16/hr on-demand. Requires enterprise approval and multi-day onboarding. Best for funded startups training foundation models.

Modal - Serverless GPU compute with per-second billing and auto-scale to zero. H100 at ~$3.95/hr. Developer experience is extraordinary (Python SDK, cold starts <15s). No SSH access, Python-only. Best for inference APIs and scheduled batch jobs.

Paperspace (DigitalOcean) - Free GPU tier with 12-hour auto-shutdown. H100 at $5.95/hr on-demand, $2.24/hr with 3-year commit. Best for beginners and prototyping.

Salad - 60,000+ consumer GPUs from gamers. RTX 4090 at $0.204/hr. No data center GPUs. Best for inference workloads tolerant of consumer-grade reliability.

Tier 4: Hyperscalers (Enterprise Compliance Premium)

Hyperscaler	H100/hr	Premium vs. RunPod Community ($1.99/hr)	Best For
GCP	~$3.00	+50%	Vertex AI integration, TPU alternative
AWS	~$3.90	+96%	SageMaker, global compliance (HIPAA/FedRAMP)
Azure	~$6.98	+250%	Microsoft ecosystem, excellent A100 spot ($0.74)
Oracle	~$10.00	+400%	NOT recommended for GPU cost optimization

The hyperscaler premium buys you enterprise SLAs, managed Kubernetes, compliance certifications (HIPAA, SOC2, FedRAMP), and global region availability. If you don't need those, you're overpaying by 50-400%.
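
"X% more" and "Y% less" describe the same price gap from opposite ends, which is why a 40-85% saving and a 50-400% premium can both be true. A short sketch of the two calculations, assuming the baseline is RunPod Community on-demand at $1.99/hr (the percentages in the table are consistent with that baseline, but it is an inference):

```python
def premium_pct(price: float, baseline: float) -> float:
    """How much more `price` costs than `baseline`, as a percentage."""
    return (price / baseline - 1) * 100


def savings_pct(price: float, reference: float) -> float:
    """How much less `price` costs than `reference`, as a percentage."""
    return (1 - price / reference) * 100


if __name__ == "__main__":
    runpod = 1.99  # assumed baseline: RunPod Community on-demand H100
    for name, rate in [("GCP", 3.00), ("AWS", 3.90), ("Azure", 6.98), ("Oracle", 10.00)]:
        print(f"{name}: +{premium_pct(rate, runpod):.0f}% vs RunPod, "
              f"RunPod is {savings_pct(runpod, rate):.0f}% cheaper")
```

Note the asymmetry: a provider that costs 100% more than the baseline only yields a 50% saving when you switch away from it.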

Hidden Costs

Engineering Time

Setting up vLLM, configuring load balancing, managing GPU failures, and tuning batch sizes all take engineering time. Budget 20-40 hours for initial setup and 5-10 hours/month ongoing. At $75-150/hour fully loaded, that's $1,500-6,000 for initial setup plus $375-1,500/month after that.
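
The engineering estimate is a product of two ranges, so the low end pairs the fewest hours with the cheapest rate and the high end pairs the most hours with the most expensive rate. A small sketch with the hour and rate ranges quoted above (the function name is illustrative):

```python
def cost_range(hours: tuple[float, float], rate: tuple[float, float]) -> tuple[float, float]:
    """Return (low, high) cost in USD for an hours range and an hourly-rate range."""
    low = min(hours) * min(rate)
    high = max(hours) * max(rate)
    return low, high


if __name__ == "__main__":
    rate = (75, 150)  # fully loaded USD/hour, from the text above
    setup_low, setup_high = cost_range((20, 40), rate)
    ongoing_low, ongoing_high = cost_range((5, 10), rate)
    print(f"Initial setup: ${setup_low:,.0f}-{setup_high:,.0f}")
    print(f"Ongoing: ${ongoing_low:,.0f}-{ongoing_high:,.0f}/month")
```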

Idle GPU Costs

A running cloud GPU instance bills by the hour (or minute) whether you're using it or not. If your workload runs 8 hours/day but the instance stays up, you pay for 24. Auto-scaling and spot instances help but add complexity. Modal's serverless model (scale to zero) removes idle billing entirely.
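
One way to compare an always-on instance with a scale-to-zero service is to look at cost per busy hour. The sketch below uses two rates from the H100 table (RunPod Community at $1.99/hr and Modal at ~$3.95/hr) purely as an illustration; it ignores cold starts, minimum billing increments, and any per-request overhead, and the function names are hypothetical.

```python
def effective_hourly_cost(hourly_rate: float, busy_hours_per_day: float,
                          billed_hours_per_day: float = 24.0) -> float:
    """Cost per hour of useful work when the instance is billed longer than it is busy."""
    if not 0 < busy_hours_per_day <= billed_hours_per_day:
        raise ValueError("busy hours must be > 0 and <= billed hours")
    return hourly_rate * billed_hours_per_day / busy_hours_per_day


def break_even_utilization(always_on_rate: float, pay_per_use_rate: float) -> float:
    """Utilization (0-1) below which pay-per-use is cheaper than an always-on instance."""
    return always_on_rate / pay_per_use_rate


if __name__ == "__main__":
    print(f"Always-on at 8 h/day busy: ${effective_hourly_cost(1.99, 8):.2f} per busy hour")
    print(f"Break-even utilization: {break_even_utilization(1.99, 3.95):.0%}")
```

Under these assumptions, the cheaper always-on instance only wins once it is busy for roughly half the day.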

Networking and Egress

Cloud providers typically charge $0.05-0.12/GB for data egress. Lambda, RunPod, and Crusoe are more generous (Crusoe charges no egress fees at all). On AWS and GCP, egress can add 10-20% to your effective GPU cost.
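
To see whether egress matters for your workload, compare the monthly egress bill with the monthly GPU bill. In the sketch below, the 5 TB/month transfer volume and the $0.09/GB price are hypothetical inputs chosen from inside the range above, and a 720-hour month is assumed.

```python
def egress_overhead_pct(gb_per_month: float, price_per_gb: float,
                        gpu_hourly_rate: float, hours_per_month: float = 720.0) -> float:
    """Egress spend as a percentage of GPU spend for one always-on GPU."""
    egress_cost = gb_per_month * price_per_gb
    gpu_cost = gpu_hourly_rate * hours_per_month
    return egress_cost / gpu_cost * 100


if __name__ == "__main__":
    # Hypothetical workload: 5,000 GB out per month at $0.09/GB on a $3.90/hr H100.
    overhead = egress_overhead_pct(5000, 0.09, 3.90)
    print(f"Egress adds ~{overhead:.0f}% on top of GPU cost")
```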

Storage

GPU cloud storage costs $0.10-0.20/GB/month. Model weights for a 70B model are ~140 GB in FP16, or ~35 GB with 4-bit quantization. For fine-tuning with checkpoints, budget 200-500 GB.
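
Weight size scales with parameter count and bits per parameter, so the same 70B model takes about 35 GB at 4-bit and about 140 GB at 16-bit. A rough sketch of that arithmetic and the resulting storage bill, using decimal gigabytes and ignoring tokenizer files, optimizer state, and filesystem overhead:

```python
def weights_size_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate size of model weights in decimal GB."""
    return params_billions * bits_per_param / 8


def storage_cost_per_month(gb: float, price_per_gb_month: float) -> float:
    """Monthly storage cost in USD."""
    return gb * price_per_gb_month


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"70B at {bits}-bit: ~{weights_size_gb(70, bits):.0f} GB")
    low = storage_cost_per_month(200, 0.10)
    high = storage_cost_per_month(500, 0.20)
    print(f"200-500 GB of checkpoints: ${low:.0f}-{high:.0f}/month")
```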

Price History

Event	Date	Impact
AWS cuts H100 44%	June 2025	Pressured entire market downward
Nebius launches committed pricing	Jan 2026	H100 at $2.00/hr (committed)
Vast.ai H100 drops below $1.50	Feb 2026	New low for on-demand H100
RunPod launches H200	Q1 2026	$4.31/hr, first mainstream H200 availability
B200 pricing emerges	Mar 2026	$4.99-9.95/hr across early providers

The trend is clear: H100 prices are falling as supply increases and B200/GB200 begin entering the market. Expect further drops in Q2-Q3 2026.

FAQ

What's the cheapest H100 cloud GPU?

Vast.ai offers H100 80GB from ~$1.38/hr on their P2P marketplace. RunPod spot pricing starts at ~$1.25/hr. For reliable on-demand, RunPod Community Cloud at ~$1.99/hr is the best value considering uptime.

Should I use spot/preemptible GPUs?

For training with checkpointing, yes - save 30-70%. For inference serving, no - your users will experience outages when instances are reclaimed. Use on-demand or reserved for production inference.

Which provider has the best reliability?

Lambda and RunPod Secure Cloud are the most praised for uptime. Vast.ai is the least reliable (P2P marketplace, hosts can reclaim machines). CoreWeave is enterprise-grade but requires Kubernetes expertise.

Is it cheaper to buy my own GPU?

An RTX 4090 costs ~$1,800. At Salad's $0.204/hr, it takes ~8,800 hours (~12 months 24/7) to break even. If you need the GPU less than full-time, renting is cheaper. For 24/7 production workloads, buying makes sense after ~1 year.

When should I use a hyperscaler vs. a specialist?

Use AWS/GCP/Azure when you need enterprise compliance (HIPAA, SOC2, FedRAMP), managed Kubernetes, or global region availability. Use RunPod/Lambda/Vast.ai for everything else - you'll save 40-85%.

Are AMD GPUs worth considering?

Vultr and Crusoe offer MI300X at $1.85-3.45/hr. Software compatibility (ROCm vs. CUDA) is the main concern. If your framework supports AMD (PyTorch ships official ROCm builds), the price-performance can be excellent.
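
Before committing to AMD instances, it helps to confirm which backend your installed PyTorch build targets. The sketch below relies on `torch.version.hip` and `torch.version.cuda`, which are set on ROCm and CUDA builds respectively; the function name and return labels are illustrative. On ROCm builds, PyTorch still exposes the GPU through the `torch.cuda` API and the `"cuda"` device string, so most CUDA-targeted PyTorch code runs without changes.

```python
def pytorch_gpu_backend() -> str:
    """Report which GPU backend the installed PyTorch build was compiled for."""
    try:
        import torch
    except ImportError:
        return "pytorch-not-installed"
    if getattr(torch.version, "hip", None):
        return "rocm"
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu-only"


if __name__ == "__main__":
    print(f"PyTorch backend: {pytorch_gpu_backend()}")
```

This only checks how the build was compiled, not whether a GPU is actually visible; custom CUDA kernels and some third-party extensions may still need a separate ROCm port.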
