OpenAI Releases GPT-Rosalind for Drug Discovery

OpenAI now has a model inside the drug lab. GPT-Rosalind, launched April 16, is the company's first purpose-built life sciences model - a direct move into territory where Google DeepMind's AlphaFold has held an unchallenged position for years.

TL;DR

GPT-Rosalind launched April 16 as a research preview for qualified US Enterprise customers
The model targets drug discovery, genomics, protein engineering, and chemistry workflows
BixBench score: 0.751 pass rate; beats GPT-5.4 on 6 of 11 LABBench2 tasks
Named after Rosalind Franklin, the crystallographer whose X-ray images helped reveal DNA's double helix structure
Partners with early access include Amgen, Moderna, Allen Institute, Thermo Fisher Scientific, and Dyno Therapeutics
A free Codex life sciences plugin connects researchers to over 50 scientific databases and tools

What GPT-Rosalind Actually Does

The model isn't a search tool dressed up in a lab coat. GPT-Rosalind is a reasoning model trained to handle the multi-step, evidence-heavy workflows that slow down pharmaceutical research: synthesizing literature across hundreds of papers, generating testable biological hypotheses, and designing experiments from scratch.

OpenAI describes it as built "to help researchers accelerate the early stages of discovery." In practice, that means a scientist can query the model, have it pull from curated databases, run through published research, and produce a structured experimental plan - tasks that, done manually, can stretch across weeks.

The Codex Plugin and 50+ Scientific Integrations

Alongside the model itself, OpenAI is releasing a free Life Sciences plugin for Codex. The plugin connects to over 50 scientific tools and data sources, including genomic databases, protein structure repositories, and chemical compound libraries. Researchers can trigger these tools from within the model's agentic workflow without switching environments.

This is the more immediately practical piece for most labs. The plugin gives GPT-5.4-class reasoning access to specialized data sources that general-purpose models can't reliably query. Whether the model can exploit those integrations more intelligently than a well-prompted GPT-5.4 is a separate question.

How It Benchmarks

At Dyno Therapeutics, GPT-Rosalind's submissions ranked above the 95th percentile of human experts on RNA sequence-to-function prediction - using unpublished, uncontaminated sequences.

The headline number comes from Dyno Therapeutics, a gene therapy company that gave GPT-Rosalind unpublished RNA sequences and asked it to predict their function. The model's best-of-ten submissions placed above the 95th percentile of human expert predictions, and around the 84th percentile on sequence generation. Because the model couldn't have seen these sequences during training, the result is harder to dismiss as memorization.
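To make the "best-of-ten" framing concrete, here is a minimal sketch of how a best-of-k percentile could be computed. The scoring method Dyno Therapeutics used is not described in the announcement, so the function names, the "strictly below" percentile definition, and the toy scores are all assumptions for illustration, not the actual evaluation.

```python
def percentile_rank(score: float, reference: list[float]) -> float:
    """Percent of reference scores strictly below `score` (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    below = sum(1 for r in reference if r < score)
    return 100.0 * below / len(reference)


def best_of_k_percentile(model_scores: list[float], human_scores: list[float]) -> float:
    """Rank only the model's best submission against the human scores."""
    return percentile_rank(max(model_scores), human_scores)


# Toy, synthetic numbers: 100 hypothetical human scores and 3 model submissions.
human_scores = [float(i) for i in range(100)]
model_scores = [50.0, 72.0, 96.0]

print(best_of_k_percentile(model_scores, human_scores))  # 96.0
print(percentile_rank(model_scores[0], human_scores))    # 50.0
```

The gap between the two printed values is the point: taking the best of several submissions can look much stronger than any single attempt, which is worth keeping in mind when reading a best-of-ten result.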

Standardized Benchmark Scores

Benchmark	GPT-Rosalind result	Notes
BixBench	0.751 pass rate	Leading among published model scores; measures bioinformatics agent tasks
LABBench2	Beats GPT-5.4 on 6 of 11 tasks	Largest gain on CloningQA; covers literature retrieval, protocol design
Dyno Therapeutics RNA	Above the 95th percentile of human experts (best-of-ten)	Prediction task; around the 84th percentile on sequence generation

BixBench covers real-world bioinformatics scenarios - multi-step analytical trajectories through biological datasets with open-answer questions. It's harder to game than multiple-choice evals, though it still measures agent performance in controlled conditions rather than actual drug discovery outcomes.

Image: A biology microscope, the type of instrument GPT-Rosalind aims to work with. Source: commons.wikimedia.org

Who Gets Access - and Who Doesn't

Access is restricted. GPT-Rosalind is launching as a research preview for qualified Enterprise customers in the United States only. OpenAI hasn't announced a general availability timeline.

Partners With Early Access

Amgen: Sean Bruich, SVP of AI and Data, said the collaboration would "accelerate how we deliver medicines to patients."
Moderna: CEO Stéphane Bancel highlighted the model's ability to "reason across complex biological evidence" to translate insights into experimental workflows.
Allen Institute: CTO Andy Hickl emphasized that the model makes "manual steps - like finding and aligning data - more consistent and repeatable in an agentic workflow."
Thermo Fisher Scientific: Scientific instrumentation and reagents provider.
Dyno Therapeutics: Gene therapy company that ran the RNA prediction evaluation.

The partner list is deliberately shaped. These aren't random enterprise clients - they're organizations that can validate GPT-Rosalind against real, unpublished biological data. That's meaningful for credibility. It also means the only published performance numbers come from evaluations that OpenAI selected and controlled.

Post-Preview Pricing

OpenAI has set post-preview pricing at $25 per million input tokens and $125 per million output tokens. Usage is not charged during the research preview phase.
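For a rough sense of what those rates mean per task, the arithmetic can be sketched as below. Only the two per-million-token rates come from the announcement; the function name and the example token counts are hypothetical, and real workloads may differ widely.

```python
from decimal import Decimal

# Post-preview rates quoted above, in USD per million tokens.
INPUT_USD_PER_MILLION = Decimal("25")
OUTPUT_USD_PER_MILLION = Decimal("125")
TOKENS_PER_MILLION = Decimal("1000000")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request at the quoted post-preview rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical literature-synthesis task: 200,000 tokens in, 20,000 tokens out.
print(estimate_cost_usd(200_000, 20_000))  # 7.5
```

At these rates output tokens cost five times as much as input tokens, so long generated plans and reports weigh more heavily on the bill than the papers fed in as context.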

Image: Rosalind Franklin in 1955, photographed with her microscope at Birkbeck College, London. The British chemist's X-ray crystallography work was central to discovering the double-helix structure of DNA, and the model is named after her. Source: commons.wikimedia.org

What It Does Not Tell You

The benchmarks show a capable model. They don't show a model that can discover drugs.

BixBench and LABBench2 measure task performance on structured research questions. The Dyno Therapeutics RNA result is more compelling, but even ranking above the 95th percentile of human experts on a prediction task is not the same as advancing a compound to clinical trial. The gap between "useful research assistant" and "accelerates drug discovery" is the gap between GPT-Rosalind's benchmarks and the claims being made around them.

The comparison to Google DeepMind's AlphaFold also needs calibration. AlphaFold solved protein structure prediction - a specific, well-defined task where its output was verifiable against physical experiments. GPT-Rosalind is a general-purpose reasoning model fine-tuned for a domain. It can potentially help across a broader range of research tasks, but it doesn't have AlphaFold's predictive precision for any single problem.

There's also the access constraint. With access restricted to US-based Enterprise customers, outside researchers can't test the model themselves. The only performance data available comes from OpenAI's own publications and partner testimonials. That's not unusual for a research preview, but it rules out independent verification for now.

OpenAI's competitive context is visible across the launch. The Novo Nordisk partnership signed just two days earlier covered drug discovery, manufacturing, and supply chain. GPT-Rosalind provides the model-layer infrastructure for those commitments. Meanwhile, Isomorphic Labs' IsoDDE - effectively AlphaFold 4 for drug discovery - is running in parallel, also restricted and proprietary.

The race in scientific AI is narrowing to a few very well-funded players, each claiming proprietary access to results that can't be independently checked.

The Rosalind Franklin naming is apt in one uncomfortable way: Franklin's most important contribution, the X-ray image known as Photo 51, was used to confirm DNA's double-helix structure, and she went uncredited for it until decades later. How much of GPT-Rosalind's output ends up in published research - and how it gets attributed - is a question the life sciences community is going to have to answer sooner rather than later.
