GPU Cloud Pricing Comparison - March 2026
Current GPU cloud pricing from 19 providers compared - H100 from $1.25/hr (spot) to ~$10.00/hr on-demand, A100 from $0.29/hr, plus H200, B200, and RTX 4090, with monthly costs, user reviews, and reliability data.

TL;DR
- Cheapest H100: Vast.ai at ~$1.38/hr on-demand, RunPod spot at ~$1.25/hr
- Cheapest A100: Vast.ai at ~$0.29/hr, Azure spot at ~$0.74/hr
- Cheapest RTX 4090: Salad at $0.204/hr (distributed consumer GPUs)
- Best overall value: RunPod (balance of price, reliability, and ease of use)
- Hyperscalers (AWS, GCP, Azure) charge 40-100% more but offer enterprise compliance
- H100 market average dropped to ~$3.13/hr - down from $4-5/hr in early 2025
We tracked current GPU pricing across 19 providers as of late March 2026. Prices have dropped significantly since our last update - AWS cut H100 pricing 44% in June 2025, which pressured the entire market. Neo-cloud providers now offer H100s for 40-85% less than hyperscalers.
H100 80GB SXM Pricing
The most in-demand GPU for AI training and inference. Sorted by on-demand price.
| Provider | On-Demand/hr | Spot/hr | Reserved/hr | Monthly (on-demand) | Min Commit |
|---|---|---|---|---|---|
| RunPod (Spot) | - | $1.25 | - | - | None |
| Vast.ai | $1.38-1.87 | ~$1.00+ | 50% off (3-6mo) | ~$1,000-1,350 | None |
| Thunder Compute | $1.38 | N/A | N/A | ~$1,000 | None |
| RunPod (Community) | $1.99 | $1.25 | N/A | ~$1,433 | None |
| FluidStack | $2.10 | N/A | Contact | ~$1,512 | None |
| RunPod (Secure) | $2.39-2.69 | Available | N/A | ~$1,721-1,937 | None |
| Jarvislabs | $2.69 | N/A | Custom (25+) | ~$1,937 | None |
| Nebius | $2.95 | N/A | $2.00 | ~$2,124 | None |
| Vultr | $2.99 | N/A | $2.30 (36mo) | ~$2,153 | None |
| GCP (A3-highgpu) | ~$3.00 | ~$2.25 | Auto-SUD | ~$2,160 | None |
| Lambda | $3.29-4.29 | N/A | Contact (1-3yr) | ~$2,369-3,089 | None |
| Together AI | $3.49-3.99 | N/A | $2.25-2.69 | ~$2,513-2,873 | None |
| Crusoe | $3.90 | Contact | Contact | ~$2,808 | None |
| AWS (p5.48xlarge) | ~$3.90 | ~$2.50 | $1.90-2.10 (1-3yr) | ~$2,808 | None |
| Modal | ~$3.95 | N/A | N/A | ~$2,844 | None |
| Paperspace | $5.95 | N/A | $2.24 (3yr) | ~$4,284 | None |
| CoreWeave | ~$6.16 | N/A | ~$1.45 (reserved) | ~$4,435 | Enterprise |
| Azure (ND H100 v5) | ~$6.98 | ~$3.50 | Available (1/3yr) | ~$5,026 | None |
| Oracle Cloud | ~$10.00 | N/A | Contact | ~$7,200 | 8-GPU min |
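The monthly figures in these tables appear to assume roughly 720 billable hours (24 h x 30 days) of continuous use; a minimal sketch for converting any hourly rate under that assumption:
```python
# Rough monthly cost from an hourly rate, assuming ~720 billable hours
# (24 h x 30 days) of non-stop use -- matching the table figures above.
def monthly_cost(hourly_rate: float, hours: float = 720) -> float:
    return hourly_rate * hours

print(monthly_cost(1.99))  # RunPod Community H100: ~$1,433/mo
print(monthly_cost(6.98))  # Azure ND H100 v5:      ~$5,026/mo
```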
A100 80GB Pricing
Still widely available and excellent value for inference and smaller training runs.
| Provider | On-Demand/hr | Spot/hr | Monthly (on-demand) |
|---|---|---|---|
| Vast.ai | $0.29-1.50 | Marketplace | ~$209-1,080 |
| Azure (Spot) | - | $0.74 | - |
| Crusoe (Spot) | - | $1.20-1.30 | - |
| Paperspace (3yr) | $1.15 | N/A | ~$828 |
| Jarvislabs | $1.49 | N/A | ~$1,073 |
| Vultr (36mo) | $1.49 | N/A | ~$1,073 |
| RunPod | ~$1.64 | Lower | ~$1,181 |
| Crusoe | $1.65-1.95 | $1.20-1.30 | ~$1,188-1,404 |
| Modal | ~$2.50 | N/A | ~$1,800 |
| CoreWeave | ~$2.70 | N/A | ~$1,944 |
| Lambda | $2.79 | N/A | ~$2,009 |
| Vultr (on-demand) | $2.80 | N/A | ~$2,016 |
| AWS (p4de) | ~$3.43 | ~$3.07 | ~$2,470 |
| Azure | ~$3.67 | ~$0.74 | ~$2,642 |
| GCP | ~$5.78 | ~$2.51 | ~$4,162 |
H200 141GB Pricing
The newest high-end option. Limited availability across providers.
| Provider | On-Demand/hr | Reserved/hr | Monthly (on-demand) |
|---|---|---|---|
| Nebius (committed) | $3.50 | $2.30 | ~$2,520 |
| Jarvislabs | $3.80 | Custom | ~$2,736 |
| Together AI (reserved) | $4.19 | $2.59-3.19 | ~$3,017 |
| Crusoe | $4.29 | Contact | ~$3,089 |
| RunPod | $4.31 | N/A | ~$3,103 |
| Modal | ~$4.54 | N/A | ~$3,269 |
| CoreWeave | ~$6.31 | Up to 60% off | ~$4,543 |
RTX 4090 24GB Pricing
The sweet spot for inference and small-model fine-tuning.
| Provider | On-Demand/hr | Monthly |
|---|---|---|
| Salad | $0.204 | ~$149 |
| RunPod | ~$0.34 | ~$245 |
| Vast.ai | $0.35-0.55 | ~$252-396 |
| Spheron | $0.58 | ~$418 |
| Buy your own | - | ~$1,800 (one-time) |
At $0.204/hr, Salad's distributed consumer GPU network breaks even against buying your own RTX 4090 in about 12 months of continuous use. If you need the GPU less than full-time, Salad or Vast.ai are cheaper than owning.
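A minimal break-even sketch for this rent-vs-buy comparison, ignoring electricity, cooling, and resale value:
```python
# Break-even point for renting vs. buying a GPU -- a rough sketch that
# ignores electricity, cooling, and resale value.
def break_even_hours(purchase_price: float, rental_rate_per_hr: float) -> float:
    return purchase_price / rental_rate_per_hr

hours = break_even_hours(1800, 0.204)  # RTX 4090 purchase vs. Salad rental
print(f"{hours:,.0f} hours (~{hours / 720:.0f} months at 24/7)")
# -> 8,824 hours (~12 months at 24/7)
```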
Provider Profiles and User Reviews
Tier 1: Best Value (Budget-Friendly, Good Reliability)
RunPod - The default recommendation for most teams. Per-minute billing, instant GPU availability for most types, templates for quick environment setup. Community Cloud is cheaper but less reliable than Secure Cloud. Trustpilot: 204 reviews, positive. Users praise speed, pricing, and 15-minute support response times.
Vast.ai - The cheapest option period, but buyer beware. P2P marketplace means prices and reliability vary by host. Instances can disappear mid-job. Per-second billing, 68+ GPU types. Trustpilot: 4.4/5 (209 reviews). Great for batch processing and experiments that can tolerate interruptions.
Jarvislabs - Under-the-radar but competitive, especially for A100 ($1.49/hr) and H100 ($2.69/hr). Per-minute billing, spin-up in under 90 seconds, managed Jupyter workbenches. Good for quick experiments.
Tier 2: Mid-Range (More Features, Moderate Premium)
Lambda - Simplest UX in the market. Pre-installed PyTorch, 1-Click Clusters with InfiniBand. H100 at $3.29-4.29/hr. No spot pricing. Stock-outs are common for H100s. Best for researchers who want zero setup friction.
Nebius - Competitive committed pricing: H100 at $2.00/hr, H200 at $2.30/hr with multi-month agreements. Up to 35% savings on reservations. Newer provider with less community feedback but strong NVIDIA partnership.
Vultr - Straightforward pricing with rare AMD GPU options (MI300X at $1.85, MI355X at $2.59). H100 at $2.99/hr on-demand, $2.30 with 36-month prepaid. Good for teams wanting AMD hardware.
Together AI - Dual offering: serverless inference APIs and dedicated GPU clusters. H100 clusters at $3.49/hr with reserved options down to $2.25/hr. Best if you need both training compute and inference APIs from one provider.
Crusoe - 100% clean energy powered. Competitive A100 spot ($1.20/hr) and unique in offering AMD MI300X ($3.45/hr). No egress charges. Per-minute billing.
Tier 3: Enterprise / Specialized
CoreWeave - Kubernetes-native, InfiniBand networking, supports 256+ GPU clusters. Reserved H100 at ~$1.45/hr, a steep discount from the ~$6.16/hr on-demand rate. Requires enterprise approval and multi-day onboarding. Best for funded startups training foundation models.
Modal - Serverless GPU compute with per-second billing and auto-scale to zero. H100 at ~$3.95/hr. Developer experience is extraordinary (Python SDK, cold starts <15s). No SSH access, Python-only. Best for inference APIs and scheduled batch jobs.
Paperspace (DigitalOcean) - Free GPU tier with 12-hour auto-shutdown. H100 at $5.95/hr on-demand, $2.24/hr with 3-year commit. Best for beginners and prototyping.
Salad - 60,000+ consumer GPUs from gamers. RTX 4090 at $0.204/hr. No data center GPUs. Best for inference workloads tolerant of consumer-grade reliability.
Tier 4: Hyperscalers (Enterprise Compliance Premium)
| Hyperscaler | H100/hr | Premium vs. RunPod | Best For |
|---|---|---|---|
| GCP | ~$3.00 | +50% | Vertex AI integration, TPU alternative |
| AWS | ~$3.90 | +95% | SageMaker, global compliance (HIPAA/FedRAMP) |
| Azure | ~$6.98 | +250% | Microsoft ecosystem, excellent A100 spot ($0.74) |
| Oracle | ~$10.00 | +400% | NOT recommended for GPU cost optimization |
The hyperscaler premium buys you enterprise SLAs, managed Kubernetes, compliance certifications (HIPAA, SOC2, FedRAMP), and global region availability. If you don't need those, you're overpaying by 50-400%.
Hidden Costs
Engineering Time
Setting up vLLM, configuring load balancing, managing GPU failures, and tuning batch sizes. Budget 20-40 hours of engineer time for initial setup and 5-10 hours/month ongoing. At $75-150/hour fully loaded, that's $2,000-6,000 in the first month.
Idle GPU Costs
Cloud GPUs charge by the hour (or minute) whether you're using them or not. If your workload runs 8 hours/day, you pay for 24. Auto-scaling and spot instances help but add complexity. Modal's serverless model (scale to zero) removes this entirely.
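A minimal sketch of the effective rate, assuming an always-on instance that is only busy part of the day (the 8 h/day figure is illustrative):
```python
# Effective cost per *utilized* GPU-hour when an always-on instance
# is only busy part of the day.
def effective_hourly_cost(list_rate: float, busy_hours_per_day: float) -> float:
    return list_rate * 24 / busy_hours_per_day

# H100 at $2.39/hr, busy 8 h/day -> roughly 3x the list rate per useful hour
print(f"${effective_hourly_cost(2.39, 8):.2f} per busy hour")  # -> $7.17
```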
Networking and Egress
Cloud providers charge $0.05-0.12/GB for data egress. Lambda, RunPod, and Crusoe charge little or nothing for egress. AWS and GCP egress can add 10-20% to your effective GPU cost.
Storage
GPU cloud storage ranges from $0.10-0.20/GB/month. Weights for a 70B model run ~140 GB at fp16 (roughly 35 GB with 4-bit quantization). For fine-tuning with checkpoints, budget 200-500 GB.
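A rough sketch of the monthly add-on from storage and egress, using mid-range rates from above; the workload sizes are illustrative assumptions, not measured figures:
```python
# Monthly add-on from storage and egress, using mid-range rates from above
# ($0.15/GB/mo storage, $0.09/GB egress) -- adjust per provider.
def hidden_costs(storage_gb: float, egress_gb: float,
                 storage_rate: float = 0.15, egress_rate: float = 0.09) -> float:
    return storage_gb * storage_rate + egress_gb * egress_rate

# Fine-tuning run: 300 GB of weights/checkpoints, 500 GB of data moved out.
print(f"${hidden_costs(300, 500):,.2f}/month")  # -> $90.00/month
```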
Price History
| Event | Date | Impact |
|---|---|---|
| AWS cuts H100 44% | June 2025 | Pressured entire market downward |
| RunPod launches H200 | Q1 2026 | $4.31/hr, first mainstream H200 availability |
| Vast.ai H100 drops below $1.50 | Feb 2026 | New low for on-demand H100 |
| Nebius launches committed pricing | Jan 2026 | H100 at $2.00/hr (committed) |
| B200 pricing emerges | Mar 2026 | $4.99-9.95/hr across early providers |
The trend is clear: H100 prices are falling as supply increases and B200/GB200 begin entering the market. Expect further drops in Q2-Q3 2026.
FAQ
What's the cheapest H100 cloud GPU?
Vast.ai offers H100 80GB from ~$1.38/hr on its P2P marketplace. RunPod spot pricing starts at ~$1.25/hr. For on-demand, RunPod Community Cloud at ~$1.99/hr is the best value; step up to Secure Cloud ($2.39-2.69/hr) if uptime matters.
Should I use spot/preemptible GPUs?
For training with checkpointing, yes - save 30-70%. For inference serving, no - your users will experience outages when instances are reclaimed. Use on-demand or reserved for production inference.
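A back-of-the-envelope comparison for a checkpointed training job; the preemption count and restart overhead here are assumptions, not measured figures:
```python
# Spot vs. on-demand for a checkpointed training job. Assumes each
# preemption costs ~0.5 h of lost/restart time and that the job resumes
# from the last checkpoint (both are assumptions).
def spot_job_cost(compute_hours: float, spot_rate: float,
                  preemptions: int, restart_overhead_hr: float = 0.5) -> float:
    return (compute_hours + preemptions * restart_overhead_hr) * spot_rate

on_demand = 100 * 2.39                   # 100 h on an H100 at $2.39/hr on-demand
spot = spot_job_cost(100, 1.25, 6)       # same job on $1.25/hr spot, 6 preemptions
print(f"on-demand ${on_demand:.0f} vs spot ${spot:.0f}")  # ~$239 vs ~$129
```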
Which provider has the best reliability?
Lambda and RunPod Secure Cloud are the most praised for uptime. Vast.ai is the least reliable (P2P marketplace, hosts can reclaim machines). CoreWeave is enterprise-grade but requires Kubernetes expertise.
Is it cheaper to buy my own GPU?
An RTX 4090 costs ~$1,800. At Salad's $0.204/hr, it takes ~8,800 hours (~12 months 24/7) to break even. If you need the GPU less than full-time, renting is cheaper. For 24/7 production workloads, buying makes sense after ~1 year.
When should I use a hyperscaler vs. a specialist?
Use AWS/GCP/Azure when you need enterprise compliance (HIPAA, SOC2, FedRAMP), managed Kubernetes, or global region availability. Use RunPod/Lambda/Vast.ai for everything else - you'll save 40-85%.
Are AMD GPUs worth considering?
Vultr and Crusoe offer MI300X at $1.85-3.45/hr. Software compatibility (ROCm vs. CUDA) is the main concern. If your framework supports AMD (PyTorch ships native ROCm builds), the price-performance can be excellent.
Sources:
- RunPod GPU Pricing
- Lambda AI Cloud Pricing
- Vast.ai Marketplace
- Jarvislabs GPU Pricing
- CoreWeave Pricing
- Together AI Pricing
- Modal Pricing
- Paperspace Pricing
- Salad Pricing
- Nebius GPU Pricing
- Crusoe Cloud Pricing
- Vultr Cloud GPU Pricing
- AWS EC2 GPU Pricing
- GCP GPU Pricing
- Azure GPU VM Pricing
- Oracle Cloud Compute Pricing
- H100 Cloud Pricing Comparison - getdeploying.com
- Cheapest Cloud GPU Providers 2026 - Northflank
Last verified: March 26, 2026
