OpenRouter Drops a Free 100B Stealth Model With 256K Context
Elephant Alpha is a free 100B parameter model on OpenRouter with 256K context, tool use, and structured output - but your prompts get logged, there are no benchmarks, and nobody knows who made it.

TL;DR
- Elephant Alpha is a new 100B parameter model on OpenRouter - currently completely free ($0 input, $0 output)
- 256K context window, 32K max output, with function calling, structured output, prompt caching, and tool use
- OpenRouter says it's from "a prominent open model lab" but won't say which - the identity is deliberately hidden
- The trade-off: all prompts and completions are logged and used to improve the model - this is a data collection play as much as a free offering
- No published benchmarks despite claims of "matching SOTA performance of similar scale"
OpenRouter announced Elephant Alpha on April 13 - a 100B parameter "stealth model" available at zero cost. The pitch: strong reasoning, code completion, document processing, and lightweight agent workflows, all while being "extremely token efficient."
What you get
The specs are genuinely competitive for a free model:
| Spec | Value |
|---|---|
| Parameters | 100B |
| Context window | 256,144 tokens |
| Max output | 32,768 tokens |
| Input price | $0.00 / million tokens |
| Output price | $0.00 / million tokens |
| Rate limit | 100 requests/minute |
Feature support is broad: function calling, structured output, prompt caching, streaming, and tool use all work. That's a more complete feature set than most free-tier models offer.
OpenRouter positions it as an "instant" model - optimized for speed over deep reasoning. The target use cases are rapid code completion, large document processing, and iterative agent tasks where latency matters more than maximum intelligence.
What you give up
Privacy. The model card states: "Prompts and completions may be logged by the provider and used to improve the model." This isn't optional. If you're sending proprietary code, client data, or anything sensitive through Elephant Alpha, you're contributing to someone's training dataset.
Transparency. OpenRouter describes it as coming from "a prominent open model lab" but won't name the source. At 100B parameters, the candidates are limited - Llama 4 Scout (109B MoE), a Qwen variant, something from Cohere or Mistral. The community hasn't reached consensus yet.
Verified performance. OpenRouter claims it matches "SOTA performance of similar scale" but publishes no benchmark numbers. Zero. Not a single score on any eval. For a model asking you to trust it with your prompts, that's a notable absence.
Reasoning depth. No chain-of-thought or extended thinking mode. This is a fast, shallow model - useful for high-throughput tasks, less so for complex multi-step reasoning.
The stealth model pattern
OpenRouter has done this before. Their previous stealth drop, Hunter Alpha (a 1 trillion parameter model from March 2026), was widely assumed to be DeepSeek V4 testing in public. It turned out to be Xiaomi. Before that, Giga Potato served a similar role.
The pattern is consistent: drop an anonymous model at zero cost, collect usage data and community feedback, then unmask it once there's enough signal. The free pricing is temporary. The data collection is the point.
When to use it
If you need a free model with long context and tool use for non-sensitive work - prototyping agents, processing public documents, testing function calling pipelines - Elephant Alpha is hard to beat on price. 256K context with structured output at zero cost doesn't exist elsewhere.
If you care about knowing whose model is running your code, or you can't have your prompts logged, look elsewhere.
Sources:
Last updated
