Migrating from OpenAI API to Anthropic API

A practical guide to switching from OpenAI's chat completions to Anthropic's Messages API, covering endpoint mapping, tool use differences, and pricing.

From: OpenAI API | To: Anthropic API | Difficulty: Medium

TL;DR

  • Yes, you can switch - Anthropic even offers an OpenAI SDK compatibility layer for quick testing
  • System message handling, image inputs, and tool calling schemas all change
  • You gain prompt caching (up to 90% savings), extended thinking, and PDF support
  • Medium difficulty; expect 2-4 hours for a typical integration

Why Switch from OpenAI to Anthropic?

Developers are increasingly moving workloads to the Anthropic API for a few concrete reasons. Claude's extended thinking gives you step-by-step reasoning traces, which helps with complex code generation and analysis tasks. Prompt caching can cut costs by up to 90% on repeated context, which matters at scale. And features like native PDF processing and citations aren't available through OpenAI's API at all.

Still, this isn't a drop-in swap. The request format is different, system messages work differently, and tool calling uses its own schema. If you've fine-tuned prompts specifically for GPT-4o or GPT-5, expect to spend time adjusting them for Claude.

Anthropic does offer an OpenAI SDK compatibility layer that lets you point the OpenAI Python or TypeScript SDK at Anthropic's endpoint. It's useful for quick comparisons, but it strips out Claude-specific features like prompt caching, extended thinking, and structured outputs. For production, you'll want the native API.

Feature Parity Table

| Feature | OpenAI | Anthropic | Notes |
| --- | --- | --- | --- |
| Chat completions | POST /v1/chat/completions | POST /v1/messages | Different request/response format |
| Streaming | stream: true (SSE) | stream: true (SSE) | Direct equivalent, different event names |
| System prompt | role: "system" in messages array | Top-level system parameter | Anthropic uses a single system block |
| Function/tool calling | tools[] with function | tools[] with input_schema | Different schema structure |
| Structured output | response_format with JSON schema | output_config or tool strict: true | Both support strict validation |
| Image input | URL or base64 in content array | Base64 or URL in content array | Both now support URLs |
| PDF input | Not supported | Native DocumentBlockParam | Anthropic-only feature |
| Prompt caching | Not available | cache_control parameter | Up to 90% input cost savings |
| Extended thinking | Not available | thinking parameter | Step-by-step reasoning traces |
| Embeddings | POST /v1/embeddings | Not available | Use a separate provider |
| Audio input | Supported | Not supported | OpenAI-only feature |
| Batch processing | Batch API (50% discount) | Batch API (50% discount) | Both offer async batching |

API Mapping

The core difference is structural. OpenAI wraps everything in a chat/completions envelope with a choices array. Anthropic returns a message object with a content array of typed blocks.

Basic Chat Completion

OpenAI:

from openai import OpenAI

client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in two sentences."}
    ],
    max_tokens=256,
    temperature=0.7
)

print(response.choices[0].message.content)

Anthropic:

import anthropic

client = anthropic.Anthropic(api_key="sk-ant-...")

response = client.messages.create(
    model="claude-sonnet-4-6",
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": "Explain quantum computing in two sentences."}
    ],
    max_tokens=256,
    temperature=0.7
)

print(response.content[0].text)

Three things changed. The system prompt moved from the messages array to a top-level system parameter. The response comes back as response.content[0].text instead of response.choices[0].message.content. And the model name uses Anthropic's naming scheme.

Streaming

OpenAI:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about APIs."}],
    stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")

Anthropic:

with client.messages.stream(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Write a haiku about APIs."}],
    max_tokens=100
) as stream:
    for text in stream.text_stream:
        print(text, end="")

Anthropic's Python SDK provides a .stream() context manager that yields text directly, which is actually cleaner than parsing delta objects. Under the hood, both use SSE, but the event types differ - Anthropic sends content_block_delta events while OpenAI sends chat.completion.chunk events.

Tool Calling

This is where the biggest differences show up.

OpenAI:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }]
)

tool_call = response.choices[0].message.tool_calls[0]
print(tool_call.function.name, tool_call.function.arguments)

Anthropic:

response = client.messages.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    max_tokens=256,
    tools=[{
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
        }
    }]
)

tool_block = next(b for b in response.content if b.type == "tool_use")
print(tool_block.name, tool_block.input)

Two structural differences matter here. First, OpenAI nests each tool under type: "function" with a function key containing parameters, while Anthropic puts name, description, and input_schema at the top level of each tool. Second, OpenAI returns tool calls as a separate tool_calls array on the message, while Anthropic includes them as tool_use content blocks in the same content array as text.
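The round trip also differs: with Anthropic, the tool's output goes back as a tool_result block inside a user message, linked by tool_use_id. A sketch of that second half of the loop; the tool_use block is faked here so the snippet runs standalone (in real code it comes from response.content as shown above), and get_weather is a hypothetical local function:

```python
from types import SimpleNamespace

def get_weather(city: str) -> str:
    # Hypothetical stand-in for a real weather lookup.
    return f"18°C and cloudy in {city}"

# In real code: tool_block = next(b for b in response.content if b.type == "tool_use")
tool_block = SimpleNamespace(id="toolu_123", name="get_weather",
                             input={"city": "Paris"})
result = get_weather(**tool_block.input)

# Second request: the assistant turn replays the tool_use block, and the next
# user turn carries the matching tool_result (linked via tool_use_id).
followup_messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": tool_block.id,
         "name": tool_block.name, "input": tool_block.input},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": tool_block.id,
         "content": result},
    ]},
]
print(result)
```

Passing followup_messages (with the same tools list) to client.messages.create yields the final natural-language answer.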

Image Input

OpenAI:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
        ]
    }]
)

Anthropic:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=256,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image", "source": {"type": "url", "url": "https://example.com/photo.jpg"}}
        ]
    }]
)

Both APIs now support image URLs directly. Anthropic also supports base64 encoding with "type": "base64" and a media_type field. One constraint to watch: Anthropic rejects images larger than 8000x8000 pixels, and if you send more than 20 images per request, each must be under 2000x2000.
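For the base64 variant, the content block swaps the url source for base64 data plus an explicit media_type. A minimal sketch using placeholder bytes (in real code you would read your image file):

```python
import base64

# Placeholder bytes standing in for open("photo.jpg", "rb").read().
image_bytes = b"\xff\xd8\xff..."
image_data = base64.b64encode(image_bytes).decode("utf-8")

# This block slots into the content array exactly where the url version went.
image_block = {
    "type": "image",
    "source": {
        "type": "base64",
        "media_type": "image/jpeg",  # must match the actual image format
        "data": image_data,
    },
}
```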

Pricing Impact

Pricing depends heavily on which models you're comparing. Here are the most common matchups as of March 2026:

| Use Case | OpenAI Model | Price (in/out per 1M tokens) | Claude Model | Price (in/out per 1M tokens) |
| --- | --- | --- | --- | --- |
| Flagship | GPT-5.2 | $1.75 / $14.00 | Claude Opus 4.6 | $5.00 / $25.00 |
| Mid-tier | GPT-4o | $2.50 / $10.00 | Claude Sonnet 4.6 | $3.00 / $15.00 |
| Fast/cheap | GPT-4o mini | $0.15 / $0.60 | Claude Haiku 4.5 | $1.00 / $5.00 |

For a workload processing 10 million input tokens and 2 million output tokens per month at the mid-tier:

  • GPT-4o: (10 x $2.50) + (2 x $10.00) = $45.00/month
  • Claude Sonnet 4.6: (10 x $3.00) + (2 x $15.00) = $60.00/month

Claude Sonnet costs about 33% more in this scenario. But if your workload has repeated context (like a system prompt used across thousands of requests), Anthropic's prompt caching changes the math notably. A cache hit costs 10% of the standard input price, so heavily cached workloads can end up cheaper on Anthropic despite the higher base rates.
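A back-of-envelope check of those numbers. The 10% cache-hit rate comes from the paragraph above; the 80% cached fraction is an illustrative assumption, and cache writes (which carry a surcharge on top of the base input rate) are ignored in this rough sketch:

```python
# Monthly cost for token volumes in millions and prices per 1M tokens.
# cached_fraction is the share of input tokens served from the prompt cache,
# billed at 10% of the base input rate.
def monthly_cost(in_m, out_m, in_price, out_price, cached_fraction=0.0):
    cached = in_m * cached_fraction * in_price * 0.10
    uncached = in_m * (1 - cached_fraction) * in_price
    return cached + uncached + out_m * out_price

gpt4o = monthly_cost(10, 2, 2.50, 10.00)    # $45.00, matching the list above
sonnet = monthly_cost(10, 2, 3.00, 15.00)   # $60.00, matching the list above
sonnet_cached = monthly_cost(10, 2, 3.00, 15.00, cached_fraction=0.8)
print(gpt4o, sonnet, round(sonnet_cached, 2))
```

With 80% of input tokens cache-hit, the Sonnet figure drops below the uncached GPT-4o figure, which is the crossover the paragraph above describes.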

Both providers offer 50% discounts through their batch APIs for non-latency-sensitive work.

Known Gotchas

  1. System messages get concatenated. If your OpenAI code places system messages at multiple points in the conversation, Anthropic will hoist and concatenate them all into one system block at the start. This can break prompts that rely on mid-conversation system instructions.

  2. max_tokens is required. OpenAI defaults to a model-specific maximum if you omit max_tokens. Anthropic requires it explicitly - you'll get an error if it's missing.

  3. No response_format for JSON mode. OpenAI's response_format: { type: "json_object" } doesn't have a direct equivalent. Use Anthropic's structured outputs (via output_config with a JSON schema) or tool use with strict: true instead.

  4. Temperature range differs. OpenAI accepts temperatures from 0 to 2. Anthropic caps at 1.0 and silently clamps anything higher. If you were using temperature 1.5 for creative tasks, your outputs will be less varied on Claude.

  5. No embeddings endpoint. Anthropic doesn't offer an embeddings API. If your pipeline creates embeddings, you'll need to keep OpenAI for that or switch to an alternative like Voyage AI or a self-hosted solution.

  6. n parameter must be 1. OpenAI lets you request multiple completions per call (e.g., n=3). Anthropic always returns exactly one response. You'll need to make separate API calls if you want multiple generations.

  7. Token counting uses different tokenizers. The same text will produce different token counts between OpenAI (tiktoken) and Anthropic. Don't assume your existing token estimates will transfer directly.

  8. Role alternation is strict. Anthropic requires messages to alternate between user and assistant roles. OpenAI is more lenient about consecutive messages from the same role. You may need to merge adjacent user messages before sending.
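Gotcha 8 is the one most likely to need a code change. A minimal preprocessing sketch that collapses consecutive same-role messages before sending; it assumes string contents (block-style contents would need list concatenation instead):

```python
# Merge adjacent messages that share a role so the result strictly alternates,
# as Anthropic requires. Joins string contents with a blank line.
def merge_adjacent_roles(messages):
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append(dict(msg))  # copy so the input list isn't mutated
    return merged

history = [
    {"role": "user", "content": "First question."},
    {"role": "user", "content": "Actually, a follow-up."},
    {"role": "assistant", "content": "Answer."},
]
print(merge_adjacent_roles(history))
```

Running this on the three-message history above yields two messages: one merged user turn, then the assistant turn.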

FAQ

Can I use the OpenAI SDK with Anthropic's API?

Yes. Set base_url to https://api.anthropic.com/v1/ and use your Anthropic API key. But this compatibility layer doesn't support prompt caching, structured outputs, or extended thinking.

Will my OpenAI prompts work without changes?

Basic prompts usually transfer well, but heavily optimized prompts will need tuning. Anthropic recommends their Console's prompt improver as a starting point for adaptation.

Is there a compatibility layer for production use?

The OpenAI SDK compatibility layer is for testing and evaluation only. Anthropic recommends their native SDK for production workloads to access all features.

How do I handle the missing embeddings endpoint?

Use a dedicated embeddings provider like Voyage AI, Cohere, or run an open-source model. See our embeddings guide for options.

Does Anthropic support fine-tuning?

Not through their public API as of March 2026. If you rely on OpenAI fine-tuned models, you'll need to replicate that behavior through prompt engineering or explore Anthropic's enterprise offerings.

What about rate limits?

Both providers use token-based and request-based rate limits that scale with your usage tier. Check Anthropic's rate limits documentation for current thresholds.


Last verified: March 11, 2026

About the author: Priya is an AI educator and technical writer whose mission is making artificial intelligence approachable for everyone - not just engineers.