<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Code Execution | Awesome Agents</title><link>https://awesomeagents.ai/tags/code-execution/</link><description>Your guide to AI models, agents, and the future of intelligence. Reviews, leaderboards, news, and tools - all in one place.</description><language>en-us</language><managingEditor>contact@awesomeagents.ai (Awesome Agents)</managingEditor><lastBuildDate>Sun, 19 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://awesomeagents.ai/tags/code-execution/index.xml" rel="self" type="application/rss+xml"/><image><url>https://awesomeagents.ai/images/logo.png</url><title>Awesome Agents</title><link>https://awesomeagents.ai/</link></image><item><title>Agent Platform Pricing Compared 2026</title><link>https://awesomeagents.ai/pricing/agent-platform-pricing/</link><pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate><guid>https://awesomeagents.ai/pricing/agent-platform-pricing/</guid><description><![CDATA[<div class="news-tldr">
<p><strong>TL;DR</strong></p>
<ul>
<li>Most &quot;agent platforms&quot; are two bills stacked: a platform fee and the underlying LLM API cost. Many vendors bury the second line item.</li>
<li>LangGraph Platform, CrewAI Enterprise, and Vellum all charge a platform/ops fee on top of your LLM API spend - model that from day one.</li>
<li>E2B code execution sandbox: ~$0.000168/second of sandbox time. At 30 seconds per run, 100k runs/month = ~$504 in sandbox fees alone, before LLM.</li>
<li>Modal serverless compute starts at $0.000054/GB-second - among the cheapest raw compute for agent backends, with per-second scale-to-zero billing.</li>
<li>AutoGen Studio and Agno (formerly Phidata) are open-source with no managed-cloud billing - you pay only for LLM API calls and your own infra.</li>
<li>Fly Machines: $0.0000019/second per shared-CPU machine - ideal for low-concurrency VM-per-agent patterns at minimal idle cost.</li>
<li>Lindy AI and Relevance AI both bundle LLM costs into their platform pricing, which hides true unit economics - read the fine print.</li>
<li>Mastra Cloud pricing is not yet publicly listed; request access only.</li>
</ul>
</div>
<h2 id="the-hidden-bill-problem">The Hidden Bill Problem</h2>
<p>Most agent platform pricing pages show you the platform fee. They do not show you what that platform actually spends on your behalf. The pattern repeats across the market: a headline monthly floor, a per-run or per-seat cost, and then somewhere in the fine print - or not mentioned at all - the LLM API calls your agents make on each run.</p>]]></description><content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<div class="news-tldr">
<p><strong>TL;DR</strong></p>
<ul>
<li>Most &quot;agent platforms&quot; are two bills stacked: a platform fee and the underlying LLM API cost. Many vendors bury the second line item.</li>
<li>LangGraph Platform, CrewAI Enterprise, and Vellum all charge a platform/ops fee on top of your LLM API spend - model that from day one.</li>
<li>E2B code execution sandbox: ~$0.000168/second of sandbox time. At 30 seconds per run, 100k runs/month = ~$504 in sandbox fees alone, before LLM.</li>
<li>Modal serverless compute starts at $0.000054/GB-second - among the cheapest raw compute for agent backends, with per-second scale-to-zero billing.</li>
<li>AutoGen Studio and Agno (formerly Phidata) are open-source with no managed-cloud billing - you pay only for LLM API calls and your own infra.</li>
<li>Fly Machines: $0.0000019/second per shared-CPU machine - ideal for low-concurrency VM-per-agent patterns at minimal idle cost.</li>
<li>Lindy AI and Relevance AI both bundle LLM costs into their platform pricing, which hides true unit economics - read the fine print.</li>
<li>Mastra Cloud pricing is not yet publicly listed; request access only.</li>
</ul>
</div>
<h2 id="the-hidden-bill-problem">The Hidden Bill Problem</h2>
<p>Most agent platform pricing pages show you the platform fee. They do not show you what that platform actually spends on your behalf. The pattern repeats across the market: a headline monthly floor, a per-run or per-seat cost, and then somewhere in the fine print - or not mentioned at all - the LLM API calls your agents make on each run.</p>
<p>I built the tables below using a standardized test agent to make the numbers comparable: a research-and-summarize agent that makes three LLM calls per run (one planning, one tool-use, one synthesis), uses approximately 3,000 input tokens and 800 output tokens total, and executes one web search per run. This is a modest, realistic agent - not a multi-step coding agent, not a simple chatbot.</p>
<p>For the LLM cost component, I use Claude Sonnet 4.5 at $3/1M input and $15/1M output as the reference model. Swap in your model of choice - the platform costs stay the same, only the LLM passthrough changes.</p>
<h2 id="methodology">Methodology</h2>
<p>All platform prices are sourced from public pricing pages, verified April 19, 2026. Where pricing is contact-sales only, I note that explicitly rather than guessing. Where vendors bundle LLM costs, I separate them where enough information is public to do so. &quot;LLM passthrough&quot; means the vendor routes your API calls through their infrastructure and either passes the cost through at list price or marks it up - both cases are noted.</p>
<p>For the cost-at-scale tables, the LLM component assumes the reference agent above: $0.009 + $0.012 = $0.021 per run in LLM costs (3,000 input tokens at $3/1M + 800 output tokens at $15/1M, rounded). Add one web search per run at Tavily Research pricing ($0.008). Total LLM+search cost per run: approximately $0.029. This cost is added to each platform's per-run fee in the scale table.</p>
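<p>The per-run arithmetic above reduces to a few lines of Python (rates as quoted; swap in your own model's prices):</p>

```python
# Per-run LLM + search cost of the reference agent used throughout.
# Rates from this article: Claude Sonnet 4.5 at $3/1M input, $15/1M output;
# one Tavily search at $0.008 per run.
INPUT_RATE = 3.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token
SEARCH_COST = 0.008              # $ per web search

def llm_cost_per_run(input_tokens=3_000, output_tokens=800, searches=1):
    """Return the LLM + search cost for one agent run, in dollars."""
    return (input_tokens * INPUT_RATE
            + output_tokens * OUTPUT_RATE
            + searches * SEARCH_COST)

print(f"${llm_cost_per_run():.3f} per run")  # $0.029 per run
```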
<h2 id="ranked-pricing-table">Ranked Pricing Table</h2>
<p>Sorted by monthly floor cost. &quot;LLM passthrough&quot; column indicates whether the vendor adds markup on LLM calls or passes them through at API list price.</p>
<table>
  <thead>
      <tr>
          <th>Platform</th>
          <th>Monthly Floor</th>
          <th>Per-Run Cost</th>
          <th>LLM Passthrough</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>AutoGen Studio</td>
          <td>$0 (self-host)</td>
          <td>$0 platform</td>
          <td>You pay API directly</td>
          <td>Open-source only; no managed cloud</td>
      </tr>
      <tr>
          <td>Agno (formerly Phidata)</td>
          <td>$0 (self-host)</td>
          <td>$0 platform</td>
          <td>You pay API directly</td>
          <td>Open-source; cloud beta, pricing TBD</td>
      </tr>
      <tr>
          <td>Anthropic Claude Agent SDK</td>
          <td>$0</td>
          <td>$0 (SDK is free)</td>
          <td>List price</td>
          <td>Platform is free; pay Claude API only</td>
      </tr>
      <tr>
          <td>OpenAI Agents SDK</td>
          <td>$0</td>
          <td>$0 SDK; tools add-ons extra</td>
          <td>List price + surcharges</td>
          <td>Code interpreter $0.03/session; file search $0.10/GB/day</td>
      </tr>
      <tr>
          <td>Mastra Cloud</td>
          <td>Not public</td>
          <td>Not public</td>
          <td>Unknown</td>
          <td>Early access only</td>
      </tr>
      <tr>
          <td>Fly Machines</td>
          <td>~$0 idle</td>
          <td>~$0.00006/run (30s run)</td>
          <td>You pay API directly</td>
          <td>$0.0000019/sec per shared-CPU</td>
      </tr>
      <tr>
          <td>Modal</td>
          <td>~$0 idle</td>
          <td>~$0.0016/run (30s run)</td>
          <td>You pay API directly</td>
          <td>$0.000054/GB-sec; very cheap per-second billing</td>
      </tr>
      <tr>
          <td>E2B</td>
          <td>$0 free (10h/mo)</td>
          <td>~$0.005/run (30s sandbox)</td>
          <td>You pay API directly</td>
          <td>$0.000168/sec; sandbox time billed per second</td>
      </tr>
      <tr>
          <td>Daytona</td>
          <td>Free tier</td>
          <td>$0.006+/run</td>
          <td>You pay API directly</td>
          <td>Per workspace-hour; 2 free workspaces</td>
      </tr>
      <tr>
          <td>LangGraph Platform</td>
          <td>$39/mo (Plus)</td>
          <td>Included in plan limits</td>
          <td>List price</td>
          <td>Plus: 100k LangGraph calls/mo; Pro: custom</td>
      </tr>
      <tr>
          <td>Lindy AI</td>
          <td>$49.99/mo</td>
          <td>Included (Lindy credits)</td>
          <td>Bundled - unclear markup</td>
          <td>2,000 Lindy credits/mo on Starter; LLM cost opaque</td>
      </tr>
      <tr>
          <td>Griptape Cloud</td>
          <td>$0 free</td>
          <td>Pay-as-you-go compute</td>
          <td>You pay API directly</td>
          <td>Free tier, then usage-based</td>
      </tr>
      <tr>
          <td>Relevance AI</td>
          <td>$19/mo (Team)</td>
          <td>Per-credits system</td>
          <td>Bundled - markup present</td>
          <td>Credits cover LLM + platform; exact split not public</td>
      </tr>
      <tr>
          <td>Vellum</td>
          <td>$0 (Starter)</td>
          <td>Metered on requests</td>
          <td>List price (transparent)</td>
          <td>Starter free; Growth $99+/mo</td>
      </tr>
      <tr>
          <td>CrewAI Enterprise</td>
          <td>Contact sales</td>
          <td>Contact sales</td>
          <td>Unknown</td>
          <td>No public per-run pricing</td>
      </tr>
      <tr>
          <td>Taskade AI</td>
          <td>$16/mo (Pro)</td>
          <td>Included up to limits</td>
          <td>Bundled</td>
          <td>AI tasks quota per plan; LLM cost hidden</td>
      </tr>
      <tr>
          <td>Cognosys</td>
          <td>No active public pricing</td>
          <td>-</td>
          <td>-</td>
          <td>Site appears to be in maintenance mode</td>
      </tr>
  </tbody>
</table>
<h2 id="cost-at-scale">Cost at Scale</h2>
<p>This table adds the LLM + search component ($0.029/run) to each platform's per-run cost. Self-hosted open-source platforms show only infra cost (Modal or Fly Machines assumed as the compute layer).</p>
<table>
  <thead>
      <tr>
          <th>Platform</th>
          <th>1k runs/mo</th>
          <th>100k runs/mo</th>
          <th>1M runs/mo</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>AutoGen (self-host, Modal compute)</td>
          <td>$29 + $1.60 infra = <strong>$31</strong></td>
          <td>$2,900 + $160 infra = <strong>$3,060</strong></td>
          <td>$29,000 + $1,600 infra = <strong>$30,600</strong></td>
      </tr>
      <tr>
          <td>Agno (self-host, Modal compute)</td>
          <td>$29 + $1.60 = <strong>$31</strong></td>
          <td>$2,900 + $160 = <strong>$3,060</strong></td>
          <td>$29,000 + $1,600 = <strong>$30,600</strong></td>
      </tr>
      <tr>
          <td>Anthropic Agent SDK</td>
          <td><strong>$29</strong></td>
          <td><strong>$2,900</strong></td>
          <td><strong>$29,000</strong></td>
      </tr>
      <tr>
          <td>OpenAI Agents SDK (no tools)</td>
          <td><strong>$29</strong></td>
          <td><strong>$2,900</strong></td>
          <td><strong>$29,000</strong></td>
      </tr>
      <tr>
          <td>OpenAI Agents SDK (code interpreter)</td>
          <td>$29 + $30 = <strong>$59</strong></td>
          <td>$2,900 + $3,000 = <strong>$5,900</strong></td>
          <td>$29,000 + $30,000 = <strong>$59,000</strong></td>
      </tr>
      <tr>
          <td>Fly Machines + API</td>
          <td>$29 + $0.06 = <strong>$29</strong></td>
          <td>$2,900 + $5.70 = <strong>$2,906</strong></td>
          <td>$29,000 + $57 = <strong>$29,057</strong></td>
      </tr>
      <tr>
          <td>Modal + API</td>
          <td>$29 + $1.60 = <strong>$31</strong></td>
          <td>$2,900 + $160 = <strong>$3,060</strong></td>
          <td>$29,000 + $1,600 = <strong>$30,600</strong></td>
      </tr>
      <tr>
          <td>E2B + API</td>
          <td>$29 + $5 = <strong>$34</strong></td>
          <td>$2,900 + $504 = <strong>$3,404</strong></td>
          <td>$29,000 + $5,040 = <strong>$34,040</strong></td>
      </tr>
      <tr>
          <td>LangGraph Platform (Plus)</td>
          <td>$29 + $39 platform = <strong>$68</strong></td>
          <td>$2,900 + $39 platform = <strong>$2,939</strong></td>
          <td>$29,000 + custom = <strong>$29,000+</strong></td>
      </tr>
      <tr>
          <td>Lindy AI (Starter)</td>
          <td>$50 platform + LLM opaque = <strong>$50+</strong></td>
          <td>Overage pricing unclear = <strong>?</strong></td>
          <td>Enterprise = <strong>contact sales</strong></td>
      </tr>
      <tr>
          <td>Relevance AI (Team)</td>
          <td>$19 platform + credits = <strong>$50</strong></td>
          <td>Overage = <strong>$200+</strong></td>
          <td>Enterprise = <strong>contact sales</strong></td>
      </tr>
      <tr>
          <td>Vellum (Starter)</td>
          <td>$29 LLM + $0 platform = <strong>$29</strong></td>
          <td>$2,900 LLM + $99+ platform = <strong>$3,000+</strong></td>
          <td>Custom</td>
      </tr>
  </tbody>
</table>
<p>The self-hosted path (AutoGen, Agno, or any framework on Modal) consistently delivers the lowest total cost because you're paying only LLM API list price and commodity compute. The managed platforms add platform overhead that only makes sense if the ops and observability value justifies it.</p>
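<p>To rerun this table against your own traffic, the whole model collapses to one function. A sketch using the $0.029/run reference cost from the Methodology section; the platform floor and per-run fee are the only variables:</p>

```python
# Monthly total = platform floor + runs x (per-run platform fee + LLM/search).
LLM_PER_RUN = 0.029  # reference agent cost from the Methodology section

def monthly_total(runs, platform_floor=0.0, platform_per_run=0.0):
    """Total monthly cost in dollars for a platform priced this way."""
    return platform_floor + runs * (platform_per_run + LLM_PER_RUN)

# Reproduce two rows of the table above at 100k runs/month:
print(monthly_total(100_000, platform_per_run=0.0016))  # ~3060 (self-host on Modal)
print(monthly_total(100_000, platform_floor=39))        # ~2939 (LangGraph Plus)
```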
<hr>
<h2 id="per-provider-breakdown">Per-Provider Breakdown</h2>
<h3 id="langgraph-platform">LangGraph Platform</h3>
<p><strong>Pricing:</strong> Free Developer plan (local deployment only). Plus plan: $39/month for managed deployment, includes 100,000 LangGraph cloud calls per month. Production and enterprise plans are contact-sales.</p>
<p><strong>What you get:</strong> LangGraph is LangChain's graph-based agent framework - nodes and edges representing agent states and transitions. LangGraph Platform is the managed deployment layer: hosted execution, built-in persistence via checkpointers, a LangGraph Studio UI for debugging graph execution, streaming support, and webhooks. The platform handles agent state serialization and resumption, which is genuinely difficult to build well yourself. LangGraph Cloud also includes background task queuing, so your agent can run asynchronously and resume after interruptions.</p>
<p><strong>Best fit:</strong> Teams already using LangChain ecosystem tooling who need managed persistence and observability without building their own orchestration layer. The checkpointer pattern - where agent state snapshots let you resume mid-execution after failures - is one of the more useful production features in any managed agent platform.</p>
<p><strong>Gotchas:</strong> &quot;LangGraph cloud calls&quot; are not the same as LLM API calls. A single agent run may make many LangGraph calls (state transitions, tool invocations, checkpoint writes). Model your call fan-out before assuming the 100k/month Plus limit is sufficient. The $39/month base is reasonable, but large-scale deployments hit contact-sales quickly. LLM API costs run through your own keys at list price - not bundled.</p>
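<p>The fan-out math is worth sketching before assuming the Plus limit fits. The calls-per-run figure below is purely illustrative, not a LangGraph-published number - measure your own graph:</p>

```python
# Back-of-envelope check of the 100k-call Plus plan limit.
PLUS_LIMIT = 100_000  # LangGraph cloud calls per month on Plus

def max_runs(calls_per_run):
    """Agent runs per month before hitting the plan limit."""
    return PLUS_LIMIT // calls_per_run

# If one run fans out to 8 cloud calls (say 3 node transitions, 2 tool
# invocations, 3 checkpoint writes - hypothetical numbers), the effective
# ceiling is far below 100k runs:
print(max_runs(8))  # 12500 runs/month
```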
<p>Source: <a href="https://www.langchain.com/langgraph-platform">LangChain LangGraph Platform</a></p>
<hr>
<h3 id="crewai-enterprise">CrewAI Enterprise</h3>
<p><strong>Pricing:</strong> Contact sales only. No public per-run or per-seat pricing as of April 2026. Open-source CrewAI is free (self-hosted).</p>
<p><strong>What you get:</strong> CrewAI is a popular multi-agent framework where &quot;crews&quot; of specialized agents collaborate on tasks. CrewAI Enterprise adds managed execution, monitoring dashboards, role-based access control, audit logging, and support SLAs. The open-source library is MIT licensed and widely used; the enterprise offering wraps it in a SaaS deployment layer.</p>
<p><strong>Best fit:</strong> Organizations that are already running CrewAI agents at scale in production and need managed infrastructure rather than DIY Kubernetes. The open-source path is viable for engineering-heavy teams; Enterprise is for teams that want to skip the ops work.</p>
<p><strong>Gotchas:</strong> No public pricing means you cannot model costs without a sales conversation. This comparison cannot give you a number for CrewAI Enterprise per-run costs. If you're evaluating CrewAI against LangGraph Platform on cost grounds, LangGraph is the only one with a published price. LLM passthrough pricing in Enterprise is also undocumented publicly.</p>
<p>Source: <a href="https://www.crewai.com">CrewAI</a></p>
<hr>
<h3 id="autogen-studio-and-microsoft-agents">AutoGen Studio and Microsoft Agents</h3>
<p><strong>Pricing:</strong> AutoGen Studio is open-source (MIT license). No managed cloud offering. Microsoft's Azure AI Foundry and Copilot Studio include agent capabilities - priced separately via Azure consumption.</p>
<p><strong>What you get:</strong> AutoGen is Microsoft Research's multi-agent conversation framework. AutoGen Studio is a low-code interface for building AutoGen agents without writing Python. The core library is actively maintained and supports GroupChat (multi-agent coordination), code execution, and tool use. Microsoft's commercial agent surface is Copilot Studio, which is priced at $200/month per tenant for published agents plus $0.01 per message for Copilot Studio agents that exceed the included quota.</p>
<p><strong>Best fit:</strong> AutoGen the library is best for teams building custom agentic workflows where multi-agent conversation patterns fit the problem. It's one of the most flexible orchestration frameworks available. AutoGen Studio suits rapid prototyping. For production scale, most teams deploy AutoGen on their own infrastructure (Modal, Fly.io, or Kubernetes).</p>
<p><strong>Gotchas:</strong> AutoGen has no managed persistence or checkpointing built in at the platform level - you build that yourself. Microsoft's commercial agent products (Copilot Studio) are a separate product line aimed at enterprise customers and are not directly comparable to developer-facing frameworks. The $200/month Copilot Studio figure is per publishing tenant, not per agent run - the billing model is quite different from API-based platforms.</p>
<p>Source: <a href="https://microsoft.github.io/autogen/">AutoGen GitHub</a></p>
<hr>
<h3 id="anthropic-claude-agent-sdk">Anthropic Claude Agent SDK</h3>
<p><strong>Pricing:</strong> The SDK itself is free. You pay only for Claude API usage. As of April 2026: Claude Sonnet 4.5 at $3/1M input tokens, $15/1M output tokens. Claude Haiku 3.5 at $0.80/1M input, $4/1M output. Claude Opus 4 at $15/1M input, $75/1M output.</p>
<p><strong>What you get:</strong> Anthropic's Agent SDK (formerly the tool use + Claude API combination, now packaged as a structured SDK) provides the building blocks for creating agents: tool definitions, multi-turn orchestration, and built-in support for computer use, code execution, and web browsing tools. There is no managed execution environment - the SDK is a library you run on your own infrastructure.</p>
<p><strong>Best fit:</strong> Teams building Claude-native agents who want first-party support for extended thinking, prompt caching (reduces cost significantly on repeated context), and computer use capabilities. The SDK is relatively lightweight; you provide the execution environment.</p>
<p><strong>Gotchas:</strong> No platform fee is genuinely no platform fee - but Anthropic's model pricing is on the higher end compared to open-weight alternatives. At 1M runs/month on Sonnet 4.5 with our reference agent, you're looking at $29,000 in pure API costs. Prompt caching can reduce that meaningfully if your agent has a stable system prompt - cached tokens are billed at $0.30/1M input (10x reduction). Factor that in for agentic workloads with consistent context.</p>
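<p>A rough sketch of the caching effect, assuming (hypothetically) that 2,000 of the reference agent's 3,000 input tokens sit in a stable, cacheable prefix; cache-write surcharges are ignored for simplicity:</p>

```python
# Per-run cost with and without prompt caching, rates from this article:
# $3/1M fresh input, $0.30/1M cached input, $15/1M output.
def run_cost(cached_tokens=0, input_tokens=3_000, output_tokens=800):
    fresh = input_tokens - cached_tokens
    return (fresh * 3.00 + cached_tokens * 0.30 + output_tokens * 15.00) / 1_000_000

uncached = run_cost()                      # $0.0210
cached = run_cost(cached_tokens=2_000)     # $0.0156
print(f"{1 - cached / uncached:.0%} saved per run")  # 26% saved per run
```

The saving compounds at scale: at 1M runs/month, that hypothetical 26% is several thousand dollars.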
<p>Source: <a href="https://www.anthropic.com/pricing">Anthropic Pricing</a></p>
<hr>
<h3 id="openai-agents-sdk-and-assistants-api">OpenAI Agents SDK and Assistants API</h3>
<p><strong>Pricing:</strong> Agents SDK is free. GPT-4o: $2.50/1M input, $10/1M output. GPT-4o-mini: $0.15/1M input, $0.60/1M output. Assistants API surcharges: Code Interpreter $0.03/session. File Search: $0.10/GB/day for vector storage, $2.50/1K tool calls. Thread storage: $0.10/1K threads.</p>
<p><strong>What you get:</strong> OpenAI's Agents SDK (released early 2025) provides a framework for building multi-agent workflows on top of the OpenAI API, with handoffs, guardrails, and tracing built in. The Assistants API adds persistent threads, built-in code execution via Code Interpreter, and file search over a vector store. These are production-ready managed capabilities - you do not need to build your own sandboxed code execution or RAG pipeline.</p>
<p><strong>Best fit:</strong> Teams building on GPT models who need managed code execution or file search without spinning up their own infrastructure. The $0.03/session Code Interpreter cost is straightforward; at 100k sessions/month that's $3,000 - compare that to E2B's sandbox pricing for equivalent capability.</p>
<p><strong>Gotchas:</strong> The surcharges stack. An agent using both Code Interpreter and File Search pays model costs plus two additional billing dimensions. File Search storage is per-day, so a large file corpus that sits idle still accrues cost. Vector storage billing is not capped - large knowledge bases in Assistants get expensive. The Agents SDK itself has no extra fee, but the tools it calls do.</p>
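<p>A sketch of how those billing dimensions add up per run, plus the break-even against E2B's per-second sandbox billing (rates as quoted in this article):</p>

```python
# Assistants API tool surcharges per run, and the sandbox-duration point at
# which E2B's per-second billing matches one Code Interpreter session.
CODE_INTERPRETER = 0.03           # $ per session (OpenAI)
FILE_SEARCH_CALL = 2.50 / 1_000   # $ per file-search tool call
E2B_PER_SECOND = 0.000168         # $ per sandbox-second (E2B)

def openai_tool_surcharge(ci_sessions=1, fs_calls=1):
    """Tool surcharges for one run, before model token costs."""
    return ci_sessions * CODE_INTERPRETER + fs_calls * FILE_SEARCH_CALL

print(openai_tool_surcharge())              # 0.0325 per run before model costs
print(CODE_INTERPRETER / E2B_PER_SECOND)    # ~178.6s: shorter sandboxes favor E2B
```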
<p>Source: <a href="https://openai.com/api">OpenAI API</a></p>
<hr>
<h3 id="lindy-ai">Lindy AI</h3>
<p><strong>Pricing:</strong> Starter: $49.99/month for 2,000 Lindy credits. Pro: $99.99/month for 5,000 credits. Teams: $249.99/month for 15,000 credits. Business: $999.99/month for 70,000 credits. Additional credits: $0.02 each.</p>
<p><strong>What you get:</strong> Lindy is a no-code platform for building workflow-based AI agents - email responders, meeting schedulers, CRM updaters, customer support agents. Agents are built visually with triggers, conditions, and action blocks. Lindy handles the execution environment, integrations (Gmail, Slack, Salesforce, 200+ others), and runs agents in the cloud. The credit system covers both the platform execution and LLM calls.</p>
<p><strong>Best fit:</strong> Non-technical users and small teams building business workflow automations - think &quot;AI employee&quot; use cases rather than developer-facing agent APIs. The visual builder and pre-built integrations reduce time to first working agent dramatically.</p>
<p><strong>Gotchas:</strong> The credit system bundles platform cost with LLM cost, making it impossible to audit what you're actually paying for model inference versus execution. At $0.02/credit overage and 2,000 credits on Starter, a busy agent can exhaust the plan quickly. Lindy's LLM credit consumption rates per action are not clearly documented in the pricing page, so &quot;cost per run&quot; is opaque until you observe it in production. Not suitable for developers who need per-run cost predictability.</p>
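<p>Because the credits-per-run rate is undocumented, the honest model treats it as an unknown you measure in production. A sketch with a hypothetical 5 credits per run:</p>

```python
# Lindy Starter burn-down: 2,000 included credits, $0.02 per overage credit.
STARTER_CREDITS = 2_000
OVERAGE = 0.02  # $ per additional credit

def monthly_cost(runs, credits_per_run, base=49.99):
    """Starter-plan monthly cost given an observed credits-per-run rate."""
    used = runs * credits_per_run
    overage_credits = max(0, used - STARTER_CREDITS)
    return base + overage_credits * OVERAGE

# At a hypothetical 5 credits/run, 1,000 runs/month already blows the plan:
print(monthly_cost(1_000, credits_per_run=5))  # 49.99 + 3000*0.02 = 109.99
```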
<p>Source: <a href="https://www.lindy.ai/pricing">Lindy AI</a></p>
<hr>
<h3 id="relevance-ai">Relevance AI</h3>
<p><strong>Pricing:</strong> Free: 100 credits/day. Team: $19/month for 10,000 credits/month. Business: $99/month for 100,000 credits/month. Enterprise: contact sales.</p>
<p><strong>What you get:</strong> Relevance AI is an agent-builder platform with a visual workflow interface for creating and deploying AI agents and &quot;tools&quot; (small reusable LLM functions). The platform manages execution, provides a hosted API for each agent, and includes a built-in memory layer. Agents can use any LLM backend through Relevance's abstraction. The platform has decent observability and agent versioning.</p>
<p><strong>Best fit:</strong> Teams that want a structured platform for building and deploying production agents without managing infrastructure. The credit model and visual builder are aimed at technical non-engineers or small engineering teams who want to move fast.</p>
<p><strong>Gotchas:</strong> Relevance AI's credit model bundles LLM costs, but the conversion rate (credits to LLM tokens) is not clearly published. At scale, you're trusting Relevance's internal economics rather than auditing line-by-line. The $0 to $19 plan jump covers 100 credits/day vs. 10,000 credits/month - if your agent runs more than ~3 runs/day on the free tier, you're hitting limits constantly.</p>
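<p>A quick sketch of the free-tier ceiling; the credits-per-run figure is hypothetical, since Relevance does not publish a conversion rate:</p>

```python
# Free-tier headroom: 100 credits/day on the Relevance AI free plan.
FREE_CREDITS_PER_DAY = 100

def free_runs_per_day(credits_per_run):
    """Whole runs per day before the daily free-tier credits run out."""
    return FREE_CREDITS_PER_DAY // credits_per_run

# At a hypothetical ~30 credits/run, you get 3 free runs a day - consistent
# with the "~3 runs/day" ceiling noted above:
print(free_runs_per_day(30))  # 3
```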
<p>Source: <a href="https://relevanceai.com/pricing">Relevance AI Pricing</a></p>
<hr>
<h3 id="vellum">Vellum</h3>
<p><strong>Pricing:</strong> Starter: Free (limited usage). Growth: $99/month. Enterprise: contact sales. LLM API calls are passed through at list price from your own API keys.</p>
<p><strong>What you get:</strong> Vellum is an LLMOps platform - prompt management, evaluation, deployment, and observability for LLM-powered applications. It includes workflow execution for multi-step agent pipelines and a prompt versioning system. Vellum's differentiator is its test-and-evaluate layer: you can run regression tests on prompts and compare model versions before deploying changes. For agent workflows, this means you can test whether a new model or prompt change broke your agent's behavior.</p>
<p><strong>Best fit:</strong> Engineering teams building and iterating on production LLM applications who need systematic prompt management and regression testing. Less suited for one-shot agent deployments; more suited for applications where prompt quality is a continuous engineering concern.</p>
<p><strong>Gotchas:</strong> Vellum does not execute agents at the scale of a dedicated agent runtime - it's an ops layer, not a high-throughput execution engine. The free Starter tier has usage limits that are not prominently quantified on the pricing page. LLM passthrough at list price is transparent, which is genuinely better than vendors who mark up model calls.</p>
<p>Source: <a href="https://www.vellum.ai/pricing">Vellum</a></p>
<hr>
<h3 id="agno-formerly-phidata">Agno (formerly Phidata)</h3>
<p><strong>Pricing:</strong> Open-source library is free (MIT license). Agno Cloud is in beta with pricing not publicly listed as of April 2026. Self-hosted deployment costs only LLM API and your own infra.</p>
<p><strong>What you get:</strong> Agno (rebranded from Phidata in late 2024) is a Python framework for building multi-modal agents with memory, tools, and knowledge. It supports any LLM backend and includes structured agent responses, team coordination, and built-in tools for web search, database queries, and file operations. The framework is lightweight and has minimal runtime overhead compared to heavier orchestration layers.</p>
<p><strong>Best fit:</strong> Python developers who want a clean, minimal-dependency framework for building agents without the LangChain ecosystem. Agno's team/multi-agent coordination is simpler to configure than AutoGen for straightforward delegation patterns.</p>
<p><strong>Gotchas:</strong> Agno Cloud pricing is not public. If you build on the open-source library expecting a managed cloud option at a known price, you are currently blocked - request early access only. Self-hosting on Modal or Fly Machines is the pragmatic path until Cloud pricing is published.</p>
<p>Sources: <a href="https://www.agno.com">Agno</a>, <a href="https://docs.agno.com">Agno Docs</a></p>
<hr>
<h3 id="mastra-cloud">Mastra Cloud</h3>
<p><strong>Pricing:</strong> Not publicly listed. Early access only as of April 2026.</p>
<p><strong>What you get:</strong> Mastra is a TypeScript-first agent framework (open-source) with Mastra Cloud as the managed deployment layer. The framework includes workflow graphs, agent memory, tool integrations, and an eval system. The TypeScript orientation makes it a natural fit for Next.js and Node.js shops building AI features on existing web infrastructure.</p>
<p><strong>Best fit:</strong> TypeScript/Node.js teams who want a framework-native cloud deployment path. The open-source library is production-usable now; the Cloud offering adds managed execution and observability.</p>
<p><strong>Gotchas:</strong> No public pricing means you cannot include Mastra Cloud in a cost model. The open-source framework is the current practical option.</p>
<p>Source: <a href="https://mastra.ai">Mastra</a></p>
<hr>
<h3 id="griptape-cloud">Griptape Cloud</h3>
<p><strong>Pricing:</strong> Free tier available. Usage-based pay-as-you-go beyond free tier. Specific compute pricing not prominently published - the site lists free tier features but routes pricing inquiries to contact.</p>
<p><strong>What you get:</strong> Griptape is a Python framework for building deterministic AI pipelines and agents. Griptape Cloud provides managed execution, logging, and a Pipeline-as-a-Service model where you deploy Griptape pipelines and the platform handles scheduling and execution. The framework emphasizes structured pipelines with explicit data flow rather than open-ended agentic loops.</p>
<p><strong>Best fit:</strong> Teams building structured, repeatable AI pipelines where determinism and data flow control matter. Less suited for open-ended research agents; better for document processing pipelines, data extraction workflows, and scheduled automation.</p>
<p><strong>Gotchas:</strong> Cloud pricing is not clearly documented on the public site. The framework itself is free; the Cloud platform requires direct contact for production pricing above the free tier.</p>
<p>Source: <a href="https://www.griptape.ai">Griptape</a></p>
<hr>
<h3 id="e2b-code-execution-sandboxes">E2B Code Execution Sandboxes</h3>
<p><strong>Pricing:</strong> Free: 10 hours of sandbox time per month. Hobby: $100/month for ~165 sandbox hours. Pro: custom pricing. Usage-based: $0.000168/second of sandbox time ($0.6048/hour). Sandbox provisioning: negligible, typically under 1 second.</p>
<p><strong>What you get:</strong> E2B provides isolated Linux sandboxes for running AI-generated code. Each sandbox is a secure micro-VM that boots in under 200ms, executes code, and can persist for minutes or hours. E2B is purpose-built for AI coding agents that need to run untrusted code - the isolation model is designed for that threat surface. SDKs are available for Python and JavaScript, with direct integrations for major agent frameworks.</p>
<p><strong>Best fit:</strong> Coding agents, data analysis agents, or any application where the agent generates and executes code as part of its workflow. E2B handles the security model you would otherwise have to build (container isolation, network policy, resource limits, filesystem sandboxing).</p>
<p><strong>Gotchas:</strong> Cost compounds with sandbox runtime. At 30 seconds per run at $0.000168/second, each run costs $0.0050. At 100k runs/month that's $504 in pure sandbox costs, before LLM API costs. Long-running analysis tasks (60-120 seconds) double or quadruple that number. Monitor sandbox wall-clock time carefully - idle sandboxes waiting for user input still accrue cost. The 10 free hours are genuinely useful for development but burn down fast in automated test pipelines.</p>
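<p>The linear scaling with wall-clock time is easy to sketch (rate as quoted; remember that idle seconds count too):</p>

```python
# E2B sandbox cost scales linearly with wall-clock seconds, idle or not.
E2B_RATE = 0.000168  # $ per sandbox-second

def monthly_sandbox_cost(runs_per_month, seconds_per_run):
    """Pure sandbox spend per month, before any LLM API costs."""
    return runs_per_month * seconds_per_run * E2B_RATE

for secs in (30, 60, 120):
    print(secs, round(monthly_sandbox_cost(100_000, secs)))
# 30s -> 504, 60s -> 1008, 120s -> 2016 dollars per 100k runs
```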
<p>Source: <a href="https://e2b.dev/pricing">E2B Pricing</a></p>
<hr>
<h3 id="daytona">Daytona</h3>
<p><strong>Pricing:</strong> Free: 2 workspaces, limited runtime. Paid plans are workspace-hour based. Pricing is not prominently listed; the site suggests contact for production pricing.</p>
<p><strong>What you get:</strong> Daytona is a development environment manager designed for spinning up standardized dev environments rapidly. In the agent context, it's used for giving coding agents a full development environment rather than a minimal sandbox - a complete OS with language runtimes, package managers, and dev tooling. Environments are defined as code and boot in seconds.</p>
<p><strong>Best fit:</strong> Coding agents that need a complete development environment rather than a bare sandbox. If your agent needs to clone a repo, install dependencies, run tests, and make changes, Daytona's full-environment model fits better than E2B's minimal sandbox.</p>
<p><strong>Gotchas:</strong> Daytona's public pricing is sparse - production costs require a conversation with sales. The free 2-workspace limit is restrictive for concurrent agent workloads. Compare Daytona against E2B and Modal for code execution use cases; E2B has clearer public pricing for agent-scale workloads.</p>
<p>Source: <a href="https://www.daytona.io">Daytona</a></p>
<hr>
<h3 id="modal-serverless-compute">Modal Serverless Compute</h3>
<p><strong>Pricing:</strong> Free: $30/month in compute credits. Pay-as-you-go: CPU at $0.000054/GB-second, GPU at $0.000583-$0.001855/GB-second (A10G to A100). Minimum billing: 100ms. Storage: $0.20/GB-month.</p>
<p><strong>What you get:</strong> Modal is serverless Python compute - you decorate a Python function with Modal's <code>@app.function()</code> decorator and Modal handles containerization, scaling, and execution. For agent backends, Modal is the cheapest way to run Python-based agents at variable load: you pay only for actual execution time, there are no idle costs, and the platform scales from zero to thousands of concurrent executions automatically. Cold start times are typically under 1 second for pre-warmed containers.</p>
<p><strong>Best fit:</strong> Python agent backends with variable or bursty workloads. The per-second billing with 100ms minimum is the tightest billing granularity in this comparison. For agents that run for 5-30 seconds per invocation, Modal's pricing is extremely competitive compared to reserved compute or container services.</p>
<p><strong>Gotchas:</strong> Modal bills per GB-second, which means memory allocation matters. A 1GB function running for 30 seconds costs $0.0016. A 4GB function running for 30 seconds costs $0.0065 - 4x more for the same wall-clock time. Right-size your memory allocation. Storage costs are also separate; large model caches or working data stored in Modal volumes add a per-GB-month charge.</p>
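<p>Because the billing unit is the GB-second, right-sizing memory is a one-line calculation. A sketch using the CPU rate quoted above (GPU and storage excluded):</p>

```python
MODAL_RATE_PER_GB_SECOND = 0.000054  # USD, CPU rate quoted above

def modal_run_cost(memory_gb: float, seconds: float) -> float:
    """Compute cost of one function invocation; LLM costs are separate."""
    return memory_gb * seconds * MODAL_RATE_PER_GB_SECOND

print(round(modal_run_cost(1, 30), 4))  # 1GB for 30s: ~$0.0016
print(round(modal_run_cost(4, 30), 4))  # 4GB for 30s: ~$0.0065, 4x the cost
```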
<p>Source: <a href="https://modal.com/pricing">Modal Pricing</a></p>
<hr>
<h3 id="fly-machines-vm-per-agent">Fly Machines (VM-per-Agent)</h3>
<p><strong>Pricing:</strong> Shared CPU (1x): $0.0000019/second ($0.000114/minute). Dedicated CPU: $0.0000115-$0.0000612/second. RAM: $0.0000001/MB/second. IPv4: $2/month per dedicated address. Free allowances: 3 shared-CPU VMs, 256MB RAM each, 3GB persistent volume storage.</p>
<p><strong>What you get:</strong> Fly Machines are lightweight, rapidly-provisioned VMs on Fly.io's edge network. They boot in milliseconds and can be suspended when idle. The VM-per-agent pattern - spinning up a dedicated VM for each concurrent agent session - is well suited to Fly Machines: you get process isolation, persistent filesystem state, and the ability to run any workload without containerization constraints. Machines can pause when idle and resume on incoming requests, keeping costs near zero between executions.</p>
<p><strong>Best fit:</strong> Agents that need strong process isolation, persistent local state between steps, or workloads that don't fit neatly in a serverless execution model. Long-running agents that pause for user input benefit from the suspend/resume model. The free tier (3 shared VMs) is generous for development.</p>
<p><strong>Gotchas:</strong> Fly Machines billing is more complex than Modal: CPU time, RAM, storage, and networking are all separate line items. At the rates above, a shared-CPU 256MB VM running for 30 seconds costs about $0.0008 in CPU and RAM, or roughly $83 at 100k runs/month in pure compute - well under E2B's ~$504, but without the security sandbox. You provide your own code isolation. IPv4 addresses at $2/month add up if you're running many persistent Machines.</p>
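<p>The separate line items can be sanity-checked by deriving a per-run figure from the CPU and RAM rates listed above; a sketch that deliberately excludes storage, networking, and IPv4 fees:</p>

```python
FLY_SHARED_CPU_RATE = 0.0000019  # USD/second, shared-cpu-1x (rate above)
FLY_RAM_RATE = 0.0000001         # USD per MB per second (rate above)

def fly_run_cost(ram_mb: int, seconds: float) -> float:
    """CPU + RAM cost of one run; other line items excluded."""
    return seconds * (FLY_SHARED_CPU_RATE + ram_mb * FLY_RAM_RATE)

per_run = fly_run_cost(256, 30)      # 256MB VM, 30-second run
print(round(per_run, 6))             # per-run compute cost
print(round(per_run * 100_000, 2))   # monthly compute at 100k runs
```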
<p>Source: <a href="https://fly.io/pricing">Fly.io Pricing</a></p>
<hr>
<h3 id="taskade-ai">Taskade AI</h3>
<p><strong>Pricing:</strong> Starter: Free (limited AI tasks). Pro: $16/month (billed annually) for expanded AI quotas. Business: $49/month. Enterprise: contact sales. Exact AI task limits per tier are not prominently documented.</p>
<p><strong>What you get:</strong> Taskade bundles an AI agent layer into a project management and collaboration tool. AI agents in Taskade can research topics, draft documents, search the web, and run custom workflows - all within the Taskade workspace environment. The AI is tightly integrated with task lists, notes, and team collaboration rather than exposed as a standalone agent API.</p>
<p><strong>Best fit:</strong> Small teams and individuals who want AI agent capabilities embedded in their existing project management workflow, not a separate agent platform. Not suitable for building custom agent products - the API surface is limited.</p>
<p><strong>Gotchas:</strong> LLM costs are bundled into the subscription tiers with no transparent per-run economics. &quot;AI tasks&quot; quotas vary by plan but specific numbers require digging through the help docs rather than the pricing page. This is not a developer platform - treat it as a productivity tool with AI features, not an agent deployment substrate.</p>
<p>Source: <a href="https://www.taskade.com/pricing">Taskade Pricing</a></p>
<hr>
<h3 id="cognosys">Cognosys</h3>
<p><strong>Note:</strong> Cognosys.ai was an early web-based AI agent product. As of April 2026, the public site (cognosys.ai) returns a 402 error and appears to be in an inactive or maintenance state. Pricing cannot be verified. Do not plan new workloads on this platform.</p>
<hr>
<h2 id="the-llm-passthrough-problem">The LLM Passthrough Problem</h2>
<p>The single biggest gotcha across this market is LLM passthrough pricing - or rather, the absence of clear disclosure about it. Here is the breakdown:</p>
<p><strong>Transparent (you pay list price through your own keys):</strong> AutoGen, Agno, Anthropic Agent SDK, OpenAI Agents SDK, Modal, Fly Machines, E2B, Vellum, LangGraph Platform.</p>
<p><strong>Bundled (LLM costs included, economics opaque):</strong> Lindy AI, Relevance AI, Taskade AI.</p>
<p><strong>Unknown (contact sales):</strong> CrewAI Enterprise, Mastra Cloud, Daytona, Griptape Cloud above free tier.</p>
<p>The transparent category is always preferable for cost modeling. Bundled pricing hides two things: the markup on LLM calls (some vendors mark up 2-3x), and the incentive to use cheaper models without telling you. When your agent platform controls the model choice and bundles the cost, you have no visibility into whether they switched from GPT-4o to GPT-4o-mini to protect their margin.</p>
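<p>The margin effect is easy to quantify. A hypothetical sketch - the 2.5x markup is an assumption picked from the 2-3x range cited above, and the $0.029/run figure is the LLM list price implied by this article's reference workload:</p>

```python
LLM_LIST_PRICE_PER_RUN = 0.029  # USD, reference workload at list price
ASSUMED_MARKUP = 2.5            # hypothetical; bundled vendors don't disclose

runs_per_month = 100_000
transparent = runs_per_month * LLM_LIST_PRICE_PER_RUN
bundled = transparent * ASSUMED_MARKUP
print(round(transparent))  # what you'd pay through your own keys
print(round(bundled))      # the same inference behind a bundled credit model
```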
<hr>
<div class="pull-quote">
<p>At 100k agent runs per month, the ~$2,900 LLM bill typically dwarfs the platform fee. Optimize the model first, then fight over the platform overhead.</p>
</div>
<h2 id="free-tier-comparison">Free Tier Comparison</h2>
<table>
  <thead>
      <tr>
          <th>Platform</th>
          <th>Free Tier</th>
          <th>What's Included</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>AutoGen Studio</td>
          <td>Unlimited</td>
          <td>Self-hosted only; you pay LLM API</td>
      </tr>
      <tr>
          <td>Agno</td>
          <td>Unlimited</td>
          <td>Self-hosted only; Cloud beta TBD</td>
      </tr>
      <tr>
          <td>Anthropic Agent SDK</td>
          <td>No platform fee</td>
          <td>Pay Claude API from first token</td>
      </tr>
      <tr>
          <td>OpenAI Agents SDK</td>
          <td>No platform fee</td>
          <td>Pay OpenAI API from first token</td>
      </tr>
      <tr>
          <td>Modal</td>
          <td>$30/month credits</td>
<td>~555,000 GB-seconds (~18,500 runs of 30s at 1GB)</td>
      </tr>
      <tr>
          <td>Fly Machines</td>
          <td>3 VMs free</td>
          <td>Shared CPU, 256MB RAM each</td>
      </tr>
      <tr>
          <td>E2B</td>
          <td>10 hours/month sandbox</td>
          <td>~36,000 seconds sandbox time</td>
      </tr>
      <tr>
          <td>LangGraph Platform</td>
          <td>Free (local only)</td>
          <td>No cloud execution on free tier</td>
      </tr>
      <tr>
          <td>Vellum</td>
          <td>Starter free</td>
          <td>Limited requests, not clearly quantified</td>
      </tr>
      <tr>
          <td>Relevance AI</td>
          <td>100 credits/day</td>
          <td>~3 simple agent runs/day</td>
      </tr>
      <tr>
          <td>Lindy AI</td>
          <td>None</td>
          <td>Paid plan required</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="which-platform-for-which-workload">Which Platform for Which Workload</h2>
<table>
  <thead>
      <tr>
          <th>Use Case</th>
          <th>Recommended</th>
          <th>Why</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Maximum cost control, Python</td>
          <td>AutoGen or Agno on Modal</td>
          <td>$0 platform fee, cheapest compute</td>
      </tr>
      <tr>
          <td>TypeScript shop</td>
          <td>Mastra (OSS) on Fly Machines</td>
          <td>Framework fit + cheap per-second VM billing</td>
      </tr>
      <tr>
          <td>Managed persistence + observability</td>
          <td>LangGraph Platform</td>
          <td>Built-in checkpointing, $39/month transparent</td>
      </tr>
      <tr>
          <td>Code execution in agent</td>
          <td>E2B</td>
          <td>Purpose-built sandbox, clear pricing</td>
      </tr>
      <tr>
          <td>Long-running dev environment agents</td>
          <td>Daytona or Fly Machines</td>
          <td>Full OS environment, not a minimal sandbox</td>
      </tr>
      <tr>
          <td>Bursty serverless agent backend</td>
          <td>Modal</td>
          <td>$0 idle, 100ms billing granularity</td>
      </tr>
      <tr>
          <td>No-code business workflow agents</td>
          <td>Lindy AI or Taskade</td>
          <td>Visual builder, no engineering required</td>
      </tr>
      <tr>
          <td>LLM ops + agent deployment</td>
          <td>Vellum</td>
          <td>Prompt management + transparent passthrough</td>
      </tr>
      <tr>
          <td>Claude-native agents with caching</td>
          <td>Anthropic Agent SDK</td>
          <td>Prompt caching at $0.30/1M input tokens</td>
      </tr>
      <tr>
          <td>Large-scale VM-per-agent isolation</td>
          <td>Fly Machines</td>
          <td>Sub-cent per 30s run, pause/resume</td>
      </tr>
  </tbody>
</table>
<hr>
<h2 id="price-history">Price History</h2>
<ul class="timeline">
<li>
<p><strong>Early 2025</strong> - OpenAI released the Agents SDK, consolidating the previous Assistants API tooling into a framework-level abstraction. Code Interpreter and File Search remained as separately billed capabilities.</p>
</li>
<li>
<p><strong>2025</strong> - LangGraph Platform launched paid tiers after a period of free beta. The $39/month Plus tier made LangGraph the first major framework to publish a specific managed deployment price.</p>
</li>
<li>
<p><strong>Late 2024</strong> - Phidata rebranded to Agno and announced Agno Cloud, beginning a closed beta with no public pricing.</p>
</li>
<li>
<p><strong>2024</strong> - E2B launched its usage-based pricing model for code execution sandboxes, targeting AI coding agents specifically. The per-second billing model influenced how developers think about agent compute costs.</p>
</li>
<li>
<p><strong>2024</strong> - Modal's credit-based free tier and per-GB-second pricing made it the default recommendation in the open-source agent community for serverless agent backends.</p>
</li>
<li>
<p><strong>2024</strong> - Relevance AI shifted from a per-seat model to a credit-based model that bundles LLM costs, reducing cost transparency.</p>
</li>
</ul>
<hr>
<h2 id="faq">FAQ</h2>
<h3 id="what-is-the-cheapest-way-to-run-agents-in-production">What is the cheapest way to run agents in production?</h3>
<p>Self-hosting with AutoGen, Agno, or LangGraph (open-source) on Modal is the cheapest path: $0 framework cost, ~$0.0016/run in compute on Modal, plus LLM API at list price. You build your own persistence and observability. At our reference agent workload (3 LLM calls, ~3,800 tokens total), total cost per run is approximately $0.031 - almost entirely LLM cost.</p>
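<p>The split can be sketched numerically. The $0.0294 LLM figure below is an assumption - it is the remainder implied by the article's ~$0.031 total after Modal compute:</p>

```python
MODAL_RATE_PER_GB_SECOND = 0.000054  # USD

def total_run_cost(llm_usd: float, memory_gb: float = 1.0,
                   seconds: float = 30.0) -> float:
    """LLM spend plus Modal compute for one agent run."""
    return llm_usd + memory_gb * seconds * MODAL_RATE_PER_GB_SECOND

cost = total_run_cost(0.0294)      # assumed LLM cost per run
print(round(cost, 3))              # ~0.031 total per run
print(round(0.0294 / cost * 100))  # LLM share: ~95% of the total
```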
<h3 id="does-langgraph-platform-include-llm-costs">Does LangGraph Platform include LLM costs?</h3>
<p>No. LangGraph Platform charges for platform operations (state management, scheduling, graph execution) while LLM calls go through your own API keys at list price. The $39/month Plus tier covers platform overhead; model inference is billed separately. This is the transparent model, and that transparency is worth more than it might appear: you can always see exactly what inference is costing you.</p>
<h3 id="is-e2b-or-openai-code-interpreter-cheaper-for-running-agent-code">Is E2B or OpenAI Code Interpreter cheaper for running agent code?</h3>
<p>At 1,000 sessions/month: OpenAI Code Interpreter at $0.03/session = $30. E2B at $0.000168/second - assuming a 30-second average session = $0.005/session = $5. E2B wins significantly on per-session cost. The tradeoff is that E2B requires you to manage the execution environment; Code Interpreter is fully managed within OpenAI's platform. At 100k sessions/month, E2B ($504) vs. Code Interpreter ($3,000) - E2B saves roughly $2,500/month at that volume.</p>
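<p>A useful number falls out of these two rates: the break-even session length at which Code Interpreter's flat fee stops being the more expensive option. A sketch:</p>

```python
E2B_RATE = 0.000168  # USD per second of sandbox time
CI_FLAT = 0.03       # USD per Code Interpreter session

break_even_seconds = CI_FLAT / E2B_RATE
print(round(break_even_seconds, 1))  # ~178.6s
```

<p>Unless your average session approaches three minutes of sandbox time, E2B's usage-based billing wins on raw per-session cost.</p>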
<h3 id="should-i-use-lindy-or-relevance-ai-for-a-business-automation">Should I use Lindy or Relevance AI for a business automation?</h3>
<p>Both bundle LLM costs and are better for non-technical users than for developers who need to control costs precisely. Lindy's credit model is clearer on paper but opaque on what actions consume how many credits. Relevance AI's tiered credit plans are similar. For building a durable production automation with known unit economics, either platform will require measurement in production - budget overruns are common with bundled credit models when agent complexity increases.</p>
<h3 id="what-happened-to-cognosys">What happened to Cognosys?</h3>
<p>Cognosys was an early consumer-facing AI agent product that allowed users to run autonomous research and task agents through a web UI. As of April 2026, the site returns a 402 (Payment Required) error and appears inactive. Do not build on this platform.</p>
<hr>
<p><strong>Sources:</strong></p>
<ul>
<li><a href="https://www.langchain.com/langgraph-platform">LangChain LangGraph Platform</a></li>
<li><a href="https://www.crewai.com">CrewAI</a></li>
<li><a href="https://microsoft.github.io/autogen/">AutoGen</a></li>
<li><a href="https://www.anthropic.com/pricing">Anthropic Pricing</a></li>
<li><a href="https://openai.com/api">OpenAI API</a></li>
<li><a href="https://www.lindy.ai/pricing">Lindy AI Pricing</a></li>
<li><a href="https://relevanceai.com/pricing">Relevance AI Pricing</a></li>
<li><a href="https://www.vellum.ai/pricing">Vellum Pricing</a></li>
<li><a href="https://www.agno.com">Agno</a></li>
<li><a href="https://mastra.ai">Mastra</a></li>
<li><a href="https://www.griptape.ai">Griptape</a></li>
<li><a href="https://www.taskade.com/pricing">Taskade Pricing</a></li>
<li><a href="https://e2b.dev/pricing">E2B Pricing</a></li>
<li><a href="https://www.daytona.io">Daytona</a></li>
<li><a href="https://modal.com/pricing">Modal Pricing</a></li>
<li><a href="https://fly.io/pricing">Fly.io Pricing</a></li>
</ul>
<p>Also see: <a href="/pricing/llm-api-pricing-comparison/">LLM API Pricing Comparison 2026</a> and <a href="/pricing/gpu-rental-pricing/">GPU Rental Pricing 2026</a>.</p>
]]></content:encoded><dc:creator>James Kowalski</dc:creator><category>Pricing</category><media:content url="https://awesomeagents.ai/images/pricing/agent-platform-pricing_hu_1225da650eecdedf.jpg" medium="image" width="1200" height="1200"/><media:thumbnail url="https://awesomeagents.ai/images/pricing/agent-platform-pricing_hu_1225da650eecdedf.jpg" width="1200" height="1200"/></item></channel></rss>