OpenRouter Drops a Free 100B Stealth Model With 256K Context

TL;DR

Elephant Alpha is a new 100B parameter model on OpenRouter - currently completely free ($0 input, $0 output)
256K context window, 32K max output, with function calling, structured output, prompt caching, and tool use
OpenRouter says it's from "a prominent open model lab" but won't say which - the identity is deliberately hidden
The trade-off: all prompts and completions are logged and used to improve the model - this is a data collection play as much as a free offering
No published benchmarks despite claims of "matching SOTA performance of similar scale"

OpenRouter announced Elephant Alpha on April 13 - a 100B parameter "stealth model" available at zero cost. The pitch: strong reasoning, code completion, document processing, and lightweight agent workflows, all while being "extremely token efficient."

What you get

The specs are genuinely competitive for a free model:

Spec	Value
Parameters	100B
Context window	256,144 tokens
Max output	32,768 tokens
Input price	$0.00 / million tokens
Output price	$0.00 / million tokens
Rate limit	100 requests/minute

Feature support is broad: function calling, structured output, prompt caching, streaming, and tool use all work. That's a more complete feature set than most free-tier models offer.

OpenRouter positions it as an "instant" model - optimized for speed over deep reasoning. The target use cases are rapid code completion, large document processing, and iterative agent tasks where latency matters more than maximum intelligence.

What you give up

Privacy. The model card states: "Prompts and completions may be logged by the provider and used to improve the model." This isn't optional. If you're sending proprietary code, client data, or anything sensitive through Elephant Alpha, you're contributing to someone's training dataset.

Transparency. OpenRouter describes it as coming from "a prominent open model lab" but won't name the source. At 100B parameters, the candidates are limited - Llama 4 Scout (109B MoE), a Qwen variant, something from Cohere or Mistral. The community hasn't reached consensus yet.

Verified performance. OpenRouter claims it matches "SOTA performance of similar scale" but publishes no benchmark numbers. Zero. Not a single score on any eval. For a model asking you to trust it with your prompts, that's a notable absence.

Reasoning depth. No chain-of-thought or extended thinking mode. This is a fast, shallow model - useful for high-throughput tasks, less so for complex multi-step reasoning.

The stealth model pattern

OpenRouter has done this before. Their previous stealth drop, Hunter Alpha (a 1 trillion parameter model from March 2026), was widely assumed to be DeepSeek V4 testing in public. It turned out to be Xiaomi. Before that, Giga Potato served a similar role.

The pattern is consistent: drop an anonymous model at zero cost, collect usage data and community feedback, then unmask it once there's enough signal. The free pricing is temporary. The data collection is the point.

When to use it

If you need a free model with long context and tool use for non-sensitive work - prototyping agents, processing public documents, testing function calling pipelines - Elephant Alpha is hard to beat on price. 256K context with structured output at zero cost doesn't exist elsewhere.

If you care about knowing whose model is running your code, or you can't have your prompts logged, look elsewhere.

Sources: