James Kowalski

James Kowalski

AI Benchmarks & Tools Analyst

James is a software engineer turned tech writer who spent six years building backend systems at a fintech startup in Chicago before pivoting to full-time analysis of AI tools and infrastructure. His engineering background means he doesn't just read the spec sheet - he runs the benchmarks, profiles the latency, and checks whether the marketing claims hold up under real workloads.

He studied Computer Science at the University of Illinois at Urbana-Champaign, where he first got hooked on natural language processing during a senior research project on sentiment analysis. He later completed a certificate in data journalism from Northwestern's Medill School.

At Awesome Agents, James owns the leaderboards and tool comparison coverage. He maintains the site's benchmark tracking methodology and is the person who actually runs the numbers before publishing any ranking. He is also an open-source advocate and contributes to several projects in the LLM inference space.

Based in Chicago, IL.

Articles by James Kowalski
GPT-5.4

GPT-5.4

OpenAI's most capable frontier model combines native computer use, 1M-token context, and three variants at $2.50/$15 per million tokens.

GPT-5.3 Instant

GPT-5.3 Instant

GPT-5.3 Instant launched March 3, 2026, cutting hallucinations by 26.8% and overhauling ChatGPT's tone - but with documented safety regressions in the process.

Best GEO Tools in 2026 - Top 5 Platforms Ranked

Best GEO Tools in 2026 - Top 5 Platforms Ranked

A ranked review of the five best Generative Engine Optimization platforms in 2026 - from full-stack content generation to enterprise monitoring, with pricing, benchmarks, and honest trade-offs.

GLM-5

GLM-5

Zhipu AI's GLM-5 is a 744B MoE model with 40B active parameters, trained on 100K Huawei Ascend chips, scoring 77.8% SWE-bench and 50 on Artificial Analysis Intelligence Index - MIT licensed.

Qwen3.5-0.8B

Qwen3.5-0.8B

Qwen3.5-0.8B is the smallest natively multimodal model in the Qwen 3.5 family - 0.8B parameters handling text, images, and video with 262K context. MathVista 62.2, OCRBench 74.5. Apache 2.0.

Qwen3.5-2B

Qwen3.5-2B

Qwen3.5-2B is a 2B dense multimodal model with 262K context, thinking mode, and native vision including video understanding. OCRBench 84.5, VideoMME 75.6. Apache 2.0 licensed.

Qwen3.5-4B

Qwen3.5-4B

Qwen3.5-4B is a 4B dense multimodal model that matches Qwen3-30B on MMLU-Pro and beats GPT-5-Nano on vision benchmarks. Runs on 8GB VRAM, Apache 2.0 licensed, 262K-1M context.