GPT-Rosalind: OpenAI's Life Sciences Reasoning Model

OpenAI's first domain-specific reasoning model for biology and drug discovery, launched April 16, 2026 as a US-only research preview with a 0.751 BixBench score.

Overview

OpenAI launched GPT-Rosalind on April 16, 2026 as its first domain-specific reasoning model and the opening entry in a new Life Sciences series. It targets the multi-step work that dominates early drug discovery: literature synthesis, hypothesis generation, experimental design, and agentic analysis over genomics, protein engineering, and chemistry data.

TL;DR

  • Purpose-built reasoning model for biology, chemistry, and drug discovery workflows
  • Scores 0.751 on BixBench, ahead of GPT-5.4 (0.732) and Gemini 3.1 Pro (0.550)
  • Gated US-only research preview, free to qualified enterprise customers, with pricing undisclosed

OpenAI describes GPT-Rosalind as a reasoning model trained for biology from the architecture up, not a fine-tuned wrapper on GPT-5.4. A free Codex Life Sciences plugin exposes more than 50 scientific databases and tools to the agent loop. The launch came two days after OpenAI's strategic AI alliance with Novo Nordisk, and in the same quarter as Isomorphic Labs' proprietary IsoDDE pipeline and Anthropic's Coefficient Bio acquisition. Access is the real story: the model runs only inside a Trusted Access Program for qualified US enterprise research teams.

Key Specifications

| Specification | Details |
| --- | --- |
| Provider | OpenAI |
| Model Family | GPT Life Sciences (series debut) |
| Architecture | Reasoning model (not disclosed publicly) |
| Parameters | Not disclosed |
| Context Window | Not disclosed |
| Input Modalities | Text |
| Output Modality | Text |
| Release Date | April 16, 2026 |
| Availability | ChatGPT Enterprise, Codex, API (all gated) |
| License | Proprietary, Trusted Access Program |
| Pricing | Not disclosed; free during research preview |
| Codex Plugin | Free Life Sciences plugin, 50+ scientific data sources |

Architecture, parameter count, and context window are all undisclosed at launch. That matches GPT-5.4 and GPT-5.3 practice but blocks independent architecture analysis.

Benchmark Performance

OpenAI published three benchmarks at launch. Only BixBench comes with cross-model numbers, and those are the ones worth focusing on.

| Benchmark | GPT-Rosalind | GPT-5.4 | GPT-5 | Grok 4.2 | Gemini 3.1 Pro |
| --- | --- | --- | --- | --- | --- |
| BixBench (Pass@1) | 0.751 | 0.732 | 0.728 | 0.698 | 0.550 |
| LABBench2 (task families won) | 6 of 11 vs GPT-5.4 | baseline | - | - | - |
| Dyno Therapeutics RNA prediction | >95th percentile of human experts | - | - | - | - |
| Dyno Therapeutics RNA generation | 84th percentile of human experts | - | - | - | - |

[Image: A bioinformatician analyzing genomic data, the kind of workflow GPT-Rosalind targets. GPT-Rosalind's Codex plugin wires the model into 50+ of these pipelines so agents can run multi-step analyses without switching tools. Source: unsplash.com]

BixBench, built by FutureHouse and maintained by Edison Scientific, hands an agent an empty Jupyter notebook and 53 real-world bioinformatics scenarios covering 296 questions. GPT-Rosalind's 0.751 pass rate is 1.9 points over GPT-5.4 and 20.1 points over Gemini 3.1 Pro. Clear lead, though the bar is open-answer agent performance in a notebook, not wet-lab output.
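The Pass@1 arithmetic behind a score like 0.751 is simple enough to sketch. The 296-question count comes from the article; the count of correct first attempts below is invented illustration data, chosen only so the ratio lands near the reported figure:

```python
# Hedged sketch of BixBench-style Pass@1 scoring.
# n_questions matches the article (296); n_correct_first_try is a
# hypothetical value for illustration, not a real evaluation result.
n_questions = 296
n_correct_first_try = 222  # hypothetical: answered correctly on the single allowed attempt

# Pass@1 is simply the fraction of questions the agent answers
# correctly on its first (and only) try.
pass_at_1 = n_correct_first_try / n_questions
print(f"Pass@1 = {pass_at_1:.3f}")  # prints "Pass@1 = 0.750"
```

The metric says nothing about partial credit or retries, which is why a 1.9-point gap on it can matter more than it looks: every point is a whole question answered correctly in one shot.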

LABBench2 is broader. The 2026 update spans roughly 1,900 tasks across literature retrieval, database access, sequence manipulation, protocol troubleshooting, and experiment planning. OpenAI reports GPT-Rosalind beats GPT-5.4 on 6 of 11 task families, with the largest jump on CloningQA. Per-task scores aren't published.

The Dyno Therapeutics result is the most interesting and hardest to replicate. The gene therapy company gave the model unpublished RNA sequences that couldn't have appeared in training. GPT-Rosalind's best-of-ten submissions ranked above the 95th percentile of human experts on sequence-to-function prediction and around the 84th percentile on sequence generation. Dyno supplied the data, so it's harder to write off as contamination. It also isn't reproducible outside Dyno.
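A percentile-rank claim like Dyno's reduces to a small computation: take the model's best submission out of ten and count what fraction of the human expert scores it exceeds. The scores below are invented illustration data (the real Dyno sequences and scores are unpublished):

```python
# Hedged sketch of a best-of-ten percentile comparison, Dyno-style.
# All scores here are hypothetical illustration data.
from bisect import bisect_left

# Hypothetical sequence-to-function accuracy scores for ten human experts.
human_scores = sorted([0.41, 0.48, 0.52, 0.55, 0.58, 0.60, 0.63, 0.66, 0.70, 0.74])

# Hypothetical scores for the model's ten submissions; best-of-ten is what counts.
model_submissions = [0.51, 0.58, 0.62, 0.66, 0.68, 0.69, 0.70, 0.71, 0.715, 0.72]
best_of_ten = max(model_submissions)

# Percentile rank: fraction of human experts the best submission strictly beats.
rank = bisect_left(human_scores, best_of_ten)
percentile = 100 * rank / len(human_scores)
print(f"best-of-10 score {best_of_ten} beats {percentile:.0f}% of experts")
```

Note the best-of-ten framing: the model gets ten tries per task while each expert is a single data point, a detail worth keeping in mind when reading ">95th percentile".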

For the bigger picture, see our scientific reasoning LLM leaderboard and reasoning benchmarks leaderboard.

Key Capabilities

Biological reasoning. GPT-Rosalind is tuned for multi-step inference across molecules, proteins, genes, and disease-relevant biology. Workflows include target discovery and validation, genomics interpretation, pathway analysis, literature synthesis, and hypothesis generation. OpenAI positions the model for long-horizon tool-heavy tasks, not short conversational exchanges.

Codex Life Sciences plugin. The free plugin is the more broadly useful piece of the launch. It connects Codex to 50+ scientific tools and data sources covering human genetics, functional genomics, protein structure, biochemistry, clinical evidence, and public study discovery. Critically, it works with general-purpose models like GPT-5.4 too, which matters because most researchers will never qualify for the Trusted Access Program.

Agentic workflows. Allen Institute CTO Andy Hickl says the model makes "manual steps like finding and aligning data more consistent and repeatable in an agentic workflow." Literature reads, database queries, sequence analyses, and protocol drafts run inside one Codex session instead of across many tools.

Safety training. OpenAI included biosecurity refusal training and a governance review in the qualification flow, citing dual-use pathogen design concerns. Organizations must show legitimate research purposes and strong internal controls before provisioning.

Pricing and Availability

GPT-Rosalind is a US-only research preview. No self-serve onboarding, no developer playground, no published token price. Qualified enterprise customers use the model during preview without consuming ChatGPT Enterprise credits or paid API tokens. OpenAI says it will publish pricing and broader availability "as the program expands" without committing to a date.

| Access Tier | Availability | Cost |
| --- | --- | --- |
| Trusted Access Program (API) | Qualified US enterprise research teams | Free during preview |
| ChatGPT Enterprise | Same qualified customers | Free during preview |
| Codex (model) | Same qualified customers | Free during preview |
| Codex Life Sciences plugin | Public, works with any Codex model | Free |
| General API / ChatGPT Plus | Not available | Not available |
| International access | Not available | Not available |

[Image: A technician handling sample vials during DNA genotyping at a cancer genomics laboratory. OpenAI's qualification process restricts GPT-Rosalind to research teams with similar institutional governance and biosecurity controls. Source: unsplash.com]

Launch partners named across coverage: Amgen, Moderna, Thermo Fisher Scientific, the Allen Institute, Oracle Health and Life Sciences, NVIDIA, Benchling, UCSF School of Pharmacy, Los Alamos National Laboratory, and Dyno Therapeutics. Each had early access ahead of the announcement, which is how the Dyno RNA evaluation and partner quotes were produced.

"GPT-Rosalind represents an important step in helping scientific teams use advanced AI to reason across complex biological evidence, data, and workflows," said Moderna CEO Stéphane Bancel.

The free plugin is the practical contrast. Any Codex user can install the Life Sciences plugin today, point it at GPT-5.4 or GPT-5.3 Codex, and get programmatic access to the same 50+ databases. For labs outside the partner set, that's the real shipped product.

Strengths

  • Top BixBench score. 0.751 Pass@1 leads GPT-5.4 (0.732) and crushes Gemini 3.1 Pro (0.550) on agentic bioinformatics tasks
  • Dyno RNA result is contamination-resistant. The 95th-percentile prediction score used unpublished sequences, which is rare among vendor-published biology benchmarks
  • Purpose-built for long-horizon workflows. Literature synthesis, hypothesis generation, and experiment planning chain inside one Codex session
  • Codex Life Sciences plugin ships free. 50+ scientific data sources, usable with general-purpose models, not gated
  • Partner lineup validates the target. Amgen, Moderna, Thermo Fisher, Allen Institute, and Los Alamos are serious research shops, not marquee logos
  • Biosecurity framing is explicit. Governance review and refusal training are documented parts of the access flow

Weaknesses

  • Gated US-only research preview. No self-serve access, no international availability, and no GA timeline
  • Pricing undisclosed. Budgeting around GPT-Rosalind is impossible today
  • Parameters, architecture, context window all undisclosed. Independent architecture analysis isn't possible
  • LABBench2 reporting is thin. 6-of-11 task families beat GPT-5.4 is the headline, with no per-task scores
  • All public benchmarks are vendor-selected. Independent verification isn't possible outside OpenAI and its partner environment
  • "Drug discovery" claims outrun the evidence. Ranking above human experts on an RNA prediction task isn't the same as advancing a molecule to the clinic

Sources

Last verified April 21, 2026

About the author

James, AI Benchmarks & Tools Analyst, is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure.