Articles Tagged "Transformers"

Agent Energy Costs, Memory Attacks, and Compute Limits

Three new papers reframe how we measure agent efficiency, defend agent memory from poisoning attacks, and calculate hard accuracy ceilings for transformers.

SubQ Launches: 12M-Token Context on Sub-Quadratic AI

Subquadratic exits stealth with a $29M seed round and SubQ, the first frontier model built on a sparse-attention architecture, with a 12M-token context window at a fraction of the cost of Opus.

OpenMythos Recasts Claude Mythos as Looped MoE Transformer

Kye Gomez open-sourced OpenMythos, a PyTorch reconstruction that hypothesizes Anthropic's Mythos is a Recurrent-Depth Transformer with Mixture-of-Experts routing and Multi-Latent Attention.

MoE Routing, Prompt Gambles, and Where Reasoning Breaks

Three new papers challenge assumptions in MoE routing design, prompt optimization workflows, and LLM reasoning chains - all published this week on arXiv.

Meta Demos Neural Computers - But They Can't Do Math

A 19-person Meta AI and KAUST team including Jürgen Schmidhuber proposes Neural Computers - systems where the neural network itself is the running computer, trained solely on screen recordings.

What Is an LLM? Large Language Models Explained

A large language model is an AI system trained on billions of words to understand and generate human language. Learn how LLMs work, what they can do, and how to pick the right one.

Transformers as Bayes Nets, Memory at Scale, Agent Attacks

Three arXiv papers rethink transformer theory, expose fatal flaws in LLM in-context memory, and introduce grey-box agent security testing.

Percepta Builds a Computer Inside a Transformer

Percepta AI compiled a WebAssembly interpreter into transformer weights, executing programs deterministically at 33K tokens/sec on CPU - but the community is skeptical about the practical value.

Ai2 Releases OLMo Hybrid - Open Transformer-RNN That Halves Token Cost

OLMo Hybrid combines transformer attention with Gated DeltaNet to match OLMo 3 accuracy using 49% fewer tokens, with 75% better throughput on long contexts. Fully open - weights, checkpoints, training code, and technical report.

Etched Sohu - Transformer-Only Inference ASIC

Full specs and critical analysis of the Etched Sohu - a transformer-specific ASIC claiming 500K+ tokens/sec on Llama 70B, built on TSMC 4nm with 144GB HBM3E. Bold claims, but no independent benchmarks yet.