Grok 4.5

Grok 4.5 is xAI's 1.5-trillion-parameter V9 model in private beta at SpaceX and Tesla, with supplemental training on Cursor coding data and early evals claiming performance near Claude Opus 4.8.

Grok 4.5 is xAI's latest model, built on the V9 foundation architecture and carrying 1.5 trillion parameters - roughly three times the scale of the V8-small architecture that underpinned earlier Grok 4 variants. On June 28, 2026, Elon Musk announced on X that the model had entered private beta at SpaceX and Tesla, with supplemental training on data from Cursor, the AI coding assistant that SpaceX agreed to acquire for $60 billion in June 2026.

TL;DR

  • 1.5T parameters on the V9 foundation model - three times the scale of the 0.5T V8-small architecture
  • Private beta only as of June 29, 2026; no public API, no pricing, no context window disclosed
  • Musk's own claim: early evals "close to, perhaps exceeding Opus" - no independent benchmarks yet

The V9 completed its primary training run on May 26, 2026. Grok 4.5 is the first V9-based model to go into any kind of external deployment, though "external" is relative - the beta is restricted to engineers at two companies Musk controls. xAI is running the private beta as an active reinforcement learning (RL) environment: real engineering workflows at SpaceX (rocket path calculations, software tooling) and Tesla (vehicle manufacturing software) feed objective pass/fail signals back into ongoing RL training. Code either runs or it doesn't, which gives the Grok Build harness a cleaner training signal than subjective human preference ratings.

The predecessor in public deployment is Grok 4.3, which launched April 17, 2026, at 0.5T parameters with native video input and document generation. Before that was Grok 4.20, which set the 2M-token context window record among Western closed models. Grok 4.5 represents a distinct jump in scale rather than an incremental feature update.

Key Specifications

| Specification | Details |
| --- | --- |
| Provider | xAI (SpaceX AI division) |
| Model Family | Grok |
| Foundation | V9 |
| Parameters | 1.5 trillion |
| Context Window | Not disclosed |
| Input Price | Not disclosed |
| Output Price | Not disclosed |
| Release Date | June 28, 2026 (private beta) |
| Public API | Not available |
| Open Source | No |
| License | Proprietary |

xAI hasn't disclosed context window size, pricing, or API model IDs for Grok 4.5. The current public flagship on the xAI API is Grok 4.3, priced at $1.25 per million input tokens and $2.50 per million output tokens, with a 1M-token context window. Whether Grok 4.5 will carry a price premium over 4.3, or expand context beyond 1M, hasn't been announced.

Benchmark Performance

No independent benchmarks have been published for Grok 4.5 as of June 29, 2026. The only performance claim on the record is Musk's own post on X: "Early evals show performance close to, perhaps exceeding Opus." This refers to Claude Opus 4.8, which Anthropic released May 28, 2026, and which currently sits at the top of several public leaderboards.

| Benchmark | Grok 4.5 | Grok 4.3 | Claude Opus 4.8 | Grok 4.20 |
| --- | --- | --- | --- | --- |
| MMLU-Pro | Not released | Not reported | Not reported | Not reported |
| SWE-bench Verified | Not released | Not reported | 88.6% | 78% |
| GPQA Diamond | Not released | Not reported | Not reported | Not reported |
| Chatbot Arena ELO | Not released | ~1500 (GDPval-AA) | ~1890 (GDPval-AA) | ~1493 |
| Artificial Analysis Index | Not released | 53 | Not listed | 48 |

Musk's framing is deliberately hedged ("close to, perhaps exceeding") and the comparison is to an internal eval set, not to any published third-party benchmark. For context on what "exceeding Opus 4.8" would mean: Opus 4.8 scores 88.6% on SWE-bench Verified and 96.7% on USAMO 2026. Those are the targets Grok 4.5 is implicitly being compared against, and that's a high bar.

The Cursor training data gives a specific reason to watch coding benchmarks. Cursor's data isn't just a collection of finished code - it captures real debugging sessions, multi-file edits, and iterative developer workflows. That's different from training on static corpora like The Stack or GitHub repos. Whether that translates into meaningfully better SWE-bench scores remains to be seen until independent numbers arrive.

[Image: Code on a dark terminal screen showing software development workflow]
Grok 4.5 underwent supplemental training on Cursor session data - real developer workflows rather than static code repositories. Source: unsplash.com

Key Capabilities

V9 Architecture and Scale

The V9 is a ground-up architectural redesign, not a fine-tune of earlier Grok variants. At 1.5T parameters it sits well above the V8-small (0.5T, used in Grok 4.3's early release), and the roadmap MindStudio reported in May 2026 showed xAI training two 1.5T variants alongside a 6T and a 10T run on the Colossus 2 cluster. Grok 4.5 is the first 1.5T V9 model to reach any deployment phase.

The scale increase matters for coding in particular. Larger parameter counts generally correlate with better multi-file reasoning, keeping complex codebases in context without retrieval, and handling the kind of interdependent refactoring that trips up smaller models.
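To put the parameter counts in this section side by side, here is a rough back-of-the-envelope sketch of how much storage the weights alone would need. It assumes every parameter is stored densely at a fixed precision. That is an assumption: xAI has not disclosed the numeric format, whether V9 is a dense or mixture-of-experts model, or how the weights are served. The labels in `MODEL_PARAMS` are illustrative names for the sizes mentioned above, not official model IDs.

```python
# Bytes needed per parameter at common storage precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}

# Parameter counts quoted in this article (labels are illustrative only).
MODEL_PARAMS = {
    "V8-small": 0.5e12,
    "Grok 4.5 (V9)": 1.5e12,
    "V9 6T run": 6e12,
    "V9 10T run": 10e12,
}


def weight_terabytes(params: float, precision: str) -> float:
    """Storage for the weights only, in terabytes (1 TB = 1e12 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e12


# Scale ratio quoted in the article: 1.5T vs. 0.5T.
scale_ratio = MODEL_PARAMS["Grok 4.5 (V9)"] / MODEL_PARAMS["V8-small"]
print(f"Grok 4.5 vs V8-small: {scale_ratio:.1f}x")

for name, params in MODEL_PARAMS.items():
    print(f"{name}: {weight_terabytes(params, 'bf16'):.1f} TB at bf16")
```

Under these assumptions a 1.5T-parameter model needs about 3 TB for weights at bf16, before counting activations, caches, or optimizer state. Real deployments may differ substantially.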

Cursor Training Data

SpaceX agreed to acquire Cursor (developed by Anysphere) for $60 billion in stock in a deal announced June 16, 2026, expected to close in Q3 2026. Cursor had crossed $1 billion in annualized revenue as of November 2025, making it one of the fastest-growing developer tools in recent memory. The supplemental training added session data - actual developer interactions with Cursor's editor, including debugging traces, code diffs, and user corrections.

One important caveat from the buildfastwithai analysis: adding Cursor data in supplemental training is "not quite as good as having it in initial training." For Grok 5 and later V9 variants, xAI may integrate this data earlier in the pipeline. For now, it's a post-pretraining injection.
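As a purely illustrative sketch of what "session data" can mean in contrast to a static code file, the snippet below models one developer correction as a record and renders it as a diff. The `SessionExample` class and its fields are hypothetical. Neither xAI nor Cursor has published a schema, so this shows the general idea only, using Python's standard `difflib`.

```python
import difflib
from dataclasses import dataclass


@dataclass
class SessionExample:
    """Hypothetical record of one correction made during an editing session."""
    file_path: str
    model_suggestion: str  # code the assistant proposed
    developer_fix: str     # code the developer kept after editing

    def correction_diff(self) -> str:
        """Unified diff from the suggestion to the developer's final version."""
        return "".join(
            difflib.unified_diff(
                self.model_suggestion.splitlines(keepends=True),
                self.developer_fix.splitlines(keepends=True),
                fromfile=f"{self.file_path} (suggested)",
                tofile=f"{self.file_path} (accepted)",
            )
        )


example = SessionExample(
    file_path="utils.py",
    model_suggestion="def mean(xs):\n    return sum(xs) / len(xs)\n",
    developer_fix=(
        "def mean(xs):\n"
        "    if not xs:\n"
        "        return 0.0\n"
        "    return sum(xs) / len(xs)\n"
    ),
)
diff = example.correction_diff()
print(diff)
```

The diff records what the developer changed and where, which is the kind of signal a finished file in a repository does not carry.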

Grok Build and Live RL

The Grok Build harness is xAI's internal infrastructure for running AI-produced code autonomously, assessing it against test suites, and feeding pass/fail results back as a training signal. Musk noted in the June 28 announcement that "RL is continuing to significantly improve the model, and the Grok Build harness gets better every day." If it works as described, this is a real capability advantage: the private beta isn't just a quality assurance phase, it's an active training environment. SpaceX rocket software and Tesla vehicle manufacturing pipelines provide harder and more varied real-world coding tests than synthetic benchmarks usually do.
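The pass/fail idea can be sketched in a few lines. This is a minimal illustration of a binary reward computed by running tests against generated code, not xAI's implementation, whose details are not public. The function name `pass_fail_reward`, the 1.0/0.0 reward values, and the timeout are all assumptions, and a production harness would need real sandboxing rather than a bare subprocess.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def pass_fail_reward(candidate_code: str, test_code: str, timeout_s: float = 10.0) -> float:
    """Return 1.0 if the tests pass against the candidate code, else 0.0."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "run_candidate.py"
        # Candidate first, then the tests, so the tests can call its functions.
        script.write_text(candidate_code + "\n\n" + test_code)
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # a hang counts as a failure
        # Exit code 0 means every assertion held; anything else is a failure.
        return 1.0 if result.returncode == 0 else 0.0


TESTS = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
GOOD = "def add(a, b):\n    return a + b\n"
BAD = "def add(a, b):\n    return a - b\n"

rewards = [pass_fail_reward(code, TESTS) for code in (GOOD, BAD)]
print(rewards)  # [1.0, 0.0]
```

The reward depends only on whether the tests pass, so two runs on the same code and tests give the same signal. That is the contrast with preference ratings the article draws.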

[Image: Code editor with multiple open files showing AI-assisted development]
Cursor's session data - debugger interactions, multi-file diffs, developer corrections - gives xAI training signal that static code corpora lack. Source: unsplash.com

Pricing and Availability

Grok 4.5 isn't publicly available as of June 29, 2026. There is no API, no consumer access, no pricing, and no announced public release date. The model is running exclusively at SpaceX and Tesla.

The current xAI API lineup, with competing flagship models included for comparison:

| Model | Input $/M | Output $/M | Context |
| --- | --- | --- | --- |
| Grok 4.3 (current flagship) | $1.25 | $2.50 | 1M tokens |
| Grok 4.20 | $2.00 | $6.00 | 2M tokens (legacy) |
| Grok Build 0.1 | $1.00 | $2.00 | 256K tokens |
| Claude Opus 4.8 | $5.00 | $25.00 | 1M tokens |
| GPT-5.5 | Not confirmed | Not confirmed | Not confirmed |

When Grok 4.5 reaches the public API, pricing is likely to land somewhere above Grok 4.3 given the scale increase, but xAI has consistently priced below Anthropic. The SuperGrok Heavy subscription tier ($300/month) is the most probable first consumer access point, given that's how Grok 4.3 launched.
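To make the per-million-token prices in the table concrete, here is a small sketch that computes the cost of a single request. The prices are copied from the table above. The workload of 200K input tokens and 20K output tokens is a made-up example. Grok 4.5 and GPT-5.5 are left out because no pricing is listed for them.

```python
# USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "Grok 4.3": (1.25, 2.50),
    "Grok 4.20": (2.00, 6.00),
    "Grok Build 0.1": (1.00, 2.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: a 200K-token prompt with a 20K-token response.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 200_000, 20_000):.2f}")
```

At these list prices the example request costs $0.30 on Grok 4.3 and $1.50 on Claude Opus 4.8. This shows the size of the gap that a Grok 4.5 price premium would have to fit into if xAI continues to price below Anthropic.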

xAI has announced plans to release entirely new models trained from scratch at SpaceX on a monthly cadence through the end of 2026. Grok 4.5 is the June entry in that sequence. The July model would be the next V9 variant, presumably at 1.5T with further RL improvements, or potentially the first 6T V9 run.

Strengths

  • 1.5T parameter scale - largest V9 model in any deployment as of June 2026
  • Cursor training data adds real developer workflow signal beyond static code repos
  • Live RL from SpaceX and Tesla engineering tasks provides objective feedback
  • xAI's track record of pricing below Anthropic and OpenAI on flagship models
  • Monthly release cadence means improvements to the V9 series will arrive quickly

Weaknesses

  • No public access, no API, no pricing, no context window disclosed
  • No independent benchmarks published - only Musk's own internal eval claims
  • Cursor data added in supplemental training, not pre-training (less thorough integration)
  • Model is still actively training; what ships publicly may differ from current private beta state
  • xAI's announced release schedules have slipped before (4.5 was originally targeted for late May)

FAQ

What is Grok 4.5?

Grok 4.5 is xAI's 1.5-trillion-parameter language model built on the V9 foundation architecture, announced June 28, 2026. It's in private beta at SpaceX and Tesla, with no public API or consumer access yet.

How does Grok 4.5 compare to Claude Opus 4.8?

Elon Musk claims internal evals show performance "close to, perhaps exceeding" Opus 4.8. No independent benchmarks have been published. Claude Opus 4.8 scores 88.6% on SWE-bench Verified and leads most public coding leaderboards.

What is the Cursor connection?

SpaceX agreed to acquire AI coding assistant Cursor for $60 billion in June 2026. xAI incorporated Cursor session data - real developer workflows, debugger interactions, and code diffs - into Grok 4.5's supplemental training to improve its coding performance.

When will Grok 4.5 be publicly available?

No public release date has been announced. xAI's pattern with Grok 4.3 was private beta first for SuperGrok Heavy subscribers, then broader API access weeks later. A July 2026 public launch seems plausible but isn't confirmed.

What is the Grok Build harness?

Grok Build is xAI's internal infrastructure for running AI-generated code autonomously and evaluating it against test suites. The pass/fail results feed directly into ongoing reinforcement learning, providing deterministic training signal rather than subjective preference ratings.

✓ Last verified June 29, 2026

About the author

James Kowalski
AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.