Sophie Zhang

AI Infrastructure & Open Source Reporter

Sophie is a journalist and former systems engineer who covers AI infrastructure, open-source models, and the developer tooling ecosystem. She spent three years as a site reliability engineer at a cloud provider in Seattle before transitioning to tech journalism, which gives her writing an unusual level of technical depth: she understands distributed systems, GPU clusters, and inference optimization from the inside.

She studied Computer Engineering at the University of British Columbia and later completed a science communication fellowship at MIT. Her engineering background means she can read a model card, spot a misleading benchmark, and explain why quantization matters - all in the same paragraph.

At Awesome Agents, Sophie covers AI infrastructure news: new model releases, open-source launches, developer tools, deployment trends, and the hardware that makes it all run. She has a soft spot for underdog open-source projects that punch above their weight and a sharp eye for when a "breakthrough" is really just better marketing.

Based in Seattle, WA.

Articles by Sophie Zhang

AirTrunk Commits $30B to 5GW India Data Centers

Blackstone-backed AirTrunk pledges $30 billion and 5GW of AI data center capacity in India by 2030 - more than triple the country's current total installed base.

Google Gemma 4 QAT Fits Frontier AI in Under 1GB

Google DeepMind's new QAT checkpoints shrink the Gemma 4 E2B model to under 1GB, making serious on-device AI viable for phones and budget laptops.

NVIDIA Ships Nemotron 3 Ultra - 550B Open-Weight MoE

NVIDIA's 550B Nemotron 3 Ultra, released June 4, tops the US open-weight leaderboard with a hybrid Mamba-Transformer MoE architecture and 300-plus tokens per second throughput.

NVIDIA Drops 110 Open-Source Skills for Physical AI Devs

NVIDIA's Agent Toolkit lands 110+ verified skills on GitHub covering robotics, autonomous vehicles, vision AI, and industrial systems - turning complex physical AI pipelines into single agent calls.

NVIDIA Dynamo Snapshot Slashes Kubernetes AI Cold Starts

NVIDIA's Dynamo Snapshot uses CRIU and cuda-checkpoint to freeze and restore GPU inference containers in seconds, cutting Kubernetes cold-start times by up to 21x for large models.

Rapid-MLX Is 2.6x Faster Than Ollama on Apple Silicon

New open-source inference engine for Apple Silicon benchmarks up to 2.6x faster than Ollama, supports 66 model aliases, and drops in as an OpenAI-compatible server on any Mac.

Microsoft ASSERT Converts AI Policies Into Test Suites

Microsoft's open-source ASSERT framework turns natural language behavior specs into executable, auditable test suites for AI agents and LLM applications.

Intel Crescent Island GPU Skips HBM for 480GB LPDDR5X

Intel's Crescent Island inference GPU trades HBM bandwidth for 480GB of LPDDR5X, targeting customers locked out of NVIDIA's supply chain.

Claude Mythos Finds 10K Flaws in Critical Systems

Anthropic expands Project Glasswing to 150 organizations across 15 countries, with Claude Mythos Preview surfacing 10,000 high-severity vulnerabilities since April.

New Open Standard Puts AI Agents Under Runtime Control

The Agent Control Standard defines open middleware hooks that let teams block, allow, or modify AI agent actions before they reach production systems.

Microsoft Launches Polaris and Foundry Local at Build 2026

Microsoft's Build 2026 keynote ships Project Polaris to replace GPT-4 in GitHub Copilot by August and declares Foundry Local generally available for zero-cloud on-device inference.

Nvidia Enters the PC Market With RTX Spark Superchip

Nvidia's RTX Spark packs 20 Arm CPU cores and a Blackwell 2.0 GPU with 6,144 CUDA cores into a 45-80W Windows laptop chip, targeting Apple Silicon head-on.