
Distillation Leaks, Weak Agents, and Research Sabotage
New papers show distillation silently transfers unsafe behaviors, weak agents bottleneck multi-agent pipelines, and frontier AI can't reliably audit sabotaged ML research.


OpenAI, Anthropic, Google, and Microsoft are now sharing attack detection data through the Frontier Model Forum to collectively block Chinese adversarial distillation campaigns.

New details reveal Apple has full data center access to Gemini and can create smaller on-device derivative models - far more control than the original deal disclosed.
![FLUX.2 [klein] 9B](https://awesomeagents.ai/images/models/flux-2-klein-9b_hu_25add23b30e4a4ae.jpg)
Black Forest Labs' 9B parameter distilled image model - sub-second generation with higher quality than the 4B variant, 19.6 GB VRAM, non-commercial license.

A community fine-tune distills Claude Opus 4.6 chain-of-thought reasoning into Qwen3.5-27B via LoRA, racking up 4,000+ downloads in days. No benchmarks yet - but the approach raises familiar questions.

Community fine-tune that distills Claude Opus 4.6 reasoning into Qwen3.5-27B via LoRA. 28B parameters, Apache 2.0, no published benchmarks.
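Reasoning distillation of this kind typically works by sampling chain-of-thought traces from the teacher and fine-tuning the student on them as ordinary supervised text. A minimal sketch of the data-preparation step - the record schema and the `<think>` delimiter are illustrative assumptions, not the actual format used by this fine-tune:

```python
# Pack teacher reasoning traces into chat-style SFT records.
# The field names and <think> tag are hypothetical; real releases
# document their own trace format and chat template.

def make_sft_record(prompt: str, reasoning: str, answer: str) -> dict:
    """Build one training example from a teacher trace."""
    return {
        "messages": [
            {"role": "user", "content": prompt},
            # The student learns to reproduce the teacher's
            # chain of thought before emitting the final answer.
            {"role": "assistant",
             "content": f"<think>{reasoning}</think>\n{answer}"},
        ]
    }

traces = [
    ("What is 17 * 24?",
     "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68",
     "408"),
]
dataset = [make_sft_record(p, r, a) for p, r, a in traces]
```

From here the dataset feeds a standard LoRA fine-tuning run (e.g. with a PEFT-style trainer), which only updates low-rank adapter weights rather than the full 27B parameters.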

Comparing the Claude Opus reasoning-distilled Qwen3.5-27B against the base model - what chain-of-thought distillation adds, and what it costs in context length, multimodal capability, and reliability.

Claude Sonnet 4.6 identifies itself as DeepSeek when prompted in Chinese, just one day after Anthropic accused DeepSeek of industrial-scale distillation attacks. The cause is training data contamination, not an identity crisis - but the timing is spectacular.

Anthropic accuses three Chinese AI labs of industrial-scale distillation attacks using 24,000 fraudulent accounts and 16 million exchanges with Claude. MiniMax ran the largest operation at 13 million exchanges. None of the three companies have responded.

TeichAI, a four-person non-profit, generated 250 reasoning samples from Claude Opus 4.5, fine-tuned open-weight models on the result, and racked up 67,000 downloads. The legal and technical implications are more interesting than the benchmarks.

A practical, hands-on guide for software developers who want to finetune open-source LLMs and distill larger models into smaller, faster ones - covering techniques, tools, datasets, and cloud GPU options.
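The core objective most such guides cover is Hinton-style logit matching: train the student to reproduce the teacher's temperature-softened output distribution. A dependency-free sketch of that loss on plain lists - real training code would compute the same thing over batched tensors in a framework like PyTorch:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    as in the classic knowledge-distillation formulation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

When the student's logits match the teacher's the loss is zero; the temperature controls how much probability mass the soft targets keep on non-argmax tokens, which is where most of the distilled signal lives.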