James Kowalski

AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure. His engineering background means he doesn't just read the spec sheet - he runs the benchmarks, profiles the latency, and checks whether the marketing claims hold up under real workloads.

He studied Computer Science at the University of Illinois at Urbana-Champaign, where he first got hooked on natural language processing during a senior research project on sentiment analysis. He later completed a certificate in data journalism from Northwestern's Medill School.

At Awesome Agents, James owns the leaderboards and tool comparison coverage. He maintains the site's benchmark tracking methodology and is the person who actually runs the numbers before publishing any ranking. He is also an open-source advocate and contributes to several projects in the LLM inference space.

Based in Chicago, IL.

Articles by James Kowalski

Best ChatGPT Alternatives in 2026

Best ChatGPT Alternatives in 2026

Eight ChatGPT alternatives compared on pricing, context limits, and real-world performance - from Claude and Gemini to DeepSeek and self-hosted setups.

Lumai Iris Nova: Optical AI Inference Server

Lumai Iris Nova: Optical AI Inference Server

Lumai's optical AI inference server uses light-based computing to run billion-parameter LLMs with up to 90% less power than GPUs.

Skymizer HTX301: 700B LLMs at 240W on PCIe

Skymizer HTX301: 700B LLMs at 240W on PCIe

Skymizer's HTX301 uses six 28nm chips and 384 GB LPDDR5 to run 700B-parameter LLMs on a single PCIe card at just 240W.

Meta MTIA 450: 18.4 TB/s Inference Accelerator

Meta MTIA 450: 18.4 TB/s Inference Accelerator

Meta's MTIA 450 doubles HBM bandwidth to 18.4 TB/s and adds FlashAttention hardware acceleration for GenAI inference in 2027.

AMD Helios: 72-GPU Rack for AI at Scale

AMD Helios: 72-GPU Rack for AI at Scale

AMD Helios packs 72 Instinct MI455X GPUs and 31 TB HBM4 into a single rack delivering 2.9 FP4 ExaFLOPS for AI workloads.

Meta MTIA 400: GenAI Inference at Scale

Meta MTIA 400: GenAI Inference at Scale

Meta's second-gen ASIC delivers 6 PFLOPS FP8 and 288 GB HBM for GenAI and recommendation inference inside Meta's data centers.

GAIA Benchmark Leaderboard: Best AI Agents May 2026

GAIA Benchmark Leaderboard: Best AI Agents May 2026

Rankings of the best AI models and agent frameworks on the GAIA benchmark, which tests real-world multi-step tasks requiring web browsing, tool use, and multi-hop reasoning.

SU-01

SU-01

SU-01 is a 30B-A3B MoE reasoning model from Shanghai AI Lab that achieves gold-medal performance on IMO 2025, USAMO 2026, and IPhO 2024/2025 using a three-stage training recipe and test-time scaling.

HiDream-O1-Image

HiDream-O1-Image

HiDream-O1-Image is an 8B open-source text-to-image model with a pixel-space diffusion architecture that outperforms 32B FLUX.2 [dev] across five major benchmarks.

SubQ

SubQ

SubQ is the first LLM built on a fully subquadratic attention architecture, achieving a 12M-token research context and 52x faster inference than FlashAttention at 1M tokens.

Reasoning Model API Pricing Compared - May 2026

Reasoning Model API Pricing Compared - May 2026

Updated May 2026: DeepSeek V4-Flash reasoning now $0.28/MTok output (8x cheaper than R1), o3-pro launched at $20/$80, Grok 4 retires May 15 - verified pricing across 11 models.

ZAYA1-8B

ZAYA1-8B

Zyphra's ZAYA1-8B is an 8.4B-parameter MoE reasoning model with only 760M active parameters that matches DeepSeek-R1-0528 on math and coding benchmarks while running at a fraction of the compute cost.