Update claude-api skill: per-SDK doc split, code_execution_20260521, platform-availability, onboarding streamline (#1363)

Lance Martin
2026-06-27 09:07:56 -07:00
committed by GitHub
parent 5754626092
commit 35414756ca
46 changed files with 2490 additions and 1635 deletions

@@ -26,12 +26,26 @@ Never mix the two — don't reach for `requests`/`fetch` in a Python or TypeScri
**Never guess SDK usage.** Function names, class names, namespaces, method signatures, and import paths must come from explicit documentation — either the `{lang}/` files in this skill or the official SDK repositories or documentation links listed in `shared/live-sources.md`. If the binding you need is not explicitly documented in the skill files, WebFetch the relevant SDK repo from `shared/live-sources.md` before writing code. Do not infer Ruby/Java/Go/PHP/C# APIs from cURL shapes or from another language's SDK.
**If WebFetch or repository access fails** (network restricted, timeouts, clone blocked): do not keep retrying — write code from the patterns and namespace/package tables in the `{lang}/` file, run the compiler or interpreter on it, and iterate on the error output. For statically-typed SDKs (C#, Java, Go) a compile-fix loop against local errors reaches working code faster than blocked network research.
## Defaults
Unless the user requests otherwise:
For the Claude model version, please use Claude Opus 4.8, which you can access via the exact model string `claude-opus-4-8`. Please default to using adaptive thinking (`thinking: {type: "adaptive"}`) for anything remotely complicated. And finally, please default to streaming for any request that may involve long input, long output, or high `max_tokens`; streaming prevents request timeouts. Use the SDK's `.get_final_message()` / `.finalMessage()` helper to get the complete response if you don't need to handle individual stream events.
## ⚠️ API Drift — Your Training Prior May Be Stale
Several common Claude API shapes changed in 2025–2026. If you recall a pattern from training, verify it against the `{lang}/` files in this skill before writing; the rows below are the most frequent drift points:
| Area | Stale prior | Current API |
|---|---|---|
| Extended thinking | `thinking: {type: "enabled", budget_tokens: N}` | On Claude 4.6+ models: `thinking: {type: "adaptive"}`. `budget_tokens` is deprecated on Opus 4.6 / Sonnet 4.6 and **rejected with a 400** on Fable 5 / Opus 4.8 / 4.7. Pre-4.6 models still use `budget_tokens`. |
| Web search / web fetch tool type | `web_search_20250305`, `web_fetch_20250910` | `web_search_20260209`, `web_fetch_20260209` (dynamic filtering) on Opus 4.8/4.7/4.6 and Sonnet 4.6. Older models keep the basic variants; on Vertex AI only basic `web_search_20250305` is available (web fetch is not on Vertex) — see the Server Tools QR below. |
| PHP parameter names | snake_case wire names as named args (`max_tokens`) | Top-level named args are camelCase (`maxTokens`). Nested array keys vary by feature (e.g. `'taskBudget'`, `'skillID'`, `'mcp_server_name'`) — copy the exact key from the documented example; do not bulk-convert. |
The `{lang}/` files in this skill are authoritative over recalled patterns.
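The thinking row of the table can be sketched as a small helper. This is a minimal illustration, not SDK code: `thinking_param` and its `(major, minor)` version tuple are hypothetical names introduced here, and only the two parameter shapes come from the table.

```python
from typing import Any, Dict, Optional, Tuple


def thinking_param(version: Tuple[int, int], budget_tokens: Optional[int] = None) -> Dict[str, Any]:
    """Return a `thinking` request value for a model generation.

    `version` is a hypothetical (major, minor) tuple, e.g. (4, 8) for Opus 4.8.
    """
    if version >= (4, 6):
        # 4.6+ models: adaptive thinking; budget_tokens is deprecated or rejected.
        return {"type": "adaptive"}
    if budget_tokens is None:
        raise ValueError("pre-4.6 models need an explicit budget_tokens")
    # Pre-4.6 models still use the budgeted shape.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

Keeping the decision in one place means a stale `budget_tokens` call site fails in review rather than as a 400 at runtime.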
---
## Subcommands
@@ -111,7 +125,7 @@ Before reading code examples, determine which language the user is working in:
> **Note:** Managed Agents is the right choice when you want Anthropic to run the agent loop *and* host the container where tools execute — file ops, bash, code execution all run in the per-session workspace. If you want to host the compute yourself or run your own custom tool runtime, Claude API + tool use is the right choice — use the tool runner for automatic loop handling, or the manual loop for fine-grained control (approval gates, custom logging, conditional execution).
> **Cloud-provider access.** **Claude Platform on AWS** is Anthropic-operated with same-day API parity — Managed Agents and every feature in this skill work there, **except self-hosted sandboxes** (see `shared/claude-platform-on-aws.md`). **Amazon Bedrock**, **Google Vertex AI**, and **Microsoft Foundry** do **not** support Managed Agents or Anthropic server-side tools; use **Claude API + tool use** on those.
> **Cloud-provider access.** **Claude Platform on AWS** is Anthropic-operated with same-day API parity — see `shared/claude-platform-on-aws.md` for client setup. For per-feature availability on **Claude Platform on AWS**, **Amazon Bedrock**, **Google Vertex AI**, and **Microsoft Foundry**, see `shared/platform-availability.md` — that table is the single source of truth in this skill; do not infer availability from anywhere else.
### Decision Tree
@@ -119,8 +133,8 @@ Before reading code examples, determine which language the user is working in:
What does your application need?
0. Which provider?
├── First-party API or Claude Platform on AWS → continue (full surface available).
└── Amazon Bedrock, Google Vertex AI, or Microsoft Foundry → Claude API (+ tool use for agents); Managed Agents not available there.
├── First-party API or Claude Platform on AWS → continue (full surface available; per-feature exceptions in shared/platform-availability.md).
└── Amazon Bedrock, Google Vertex AI, or Microsoft Foundry → Claude API (+ tool use for agents); see shared/platform-availability.md for per-feature support.
1. Single LLM call (classification, summarization, extraction, Q&A)
└── Claude API — one request, one response
@@ -186,9 +200,9 @@ Everything goes through `POST /v1/messages`. Tools and output constraints are fe
Claude Fable 5 is Anthropic's most capable widely released model, for the most demanding reasoning and long-horizon agentic work. **Claude Mythos 5** (`claude-mythos-5`) offers the same capabilities, pricing, and API surface through Project Glasswing (participation is the only way to access it), succeeding the invitation-only Claude Mythos Preview (`claude-mythos-preview`) — everything below applies to both models. 1M context window (the maximum is also the default), 128K max output. Key API differences from Opus-tier — see `shared/model-migration.md` → Migrating to Claude Fable 5 for details:
- **Thinking is always on** — omit the `thinking` parameter entirely (or send `{type: "adaptive"}`). Any other explicit configuration is rejected: `{type: "disabled"}` and `{type: "enabled", budget_tokens: N}` both return a 400. Control depth with `output_config.effort` (supports `low` through `xhigh` and `max`).
- **Protected thinking = the raw chain of thought, not the summary** — responses carry regular `thinking` blocks (not `redacted_thinking`): `display: "summarized"` returns a readable summary, `"omitted"` (the default) leaves the `thinking` field as an empty string; the raw chain of thought is never exposed on any model. Replay rules: pass thinking blocks back exactly as received on the same model (including empty-text blocks — the API rejects *modified* blocks, not read ones); a **different** model **silently ignores** them (not an error), but ignored blocks still bill input tokens — strip them when switching models for good.
- **New tokenizer** — the same content tokenizes to roughly 30% more tokens than on Opus-tier models. Don't reuse token counts or `max_tokens` settings measured on other models; re-baseline with `count_tokens`.
- **`refusal` stop reason** — safety classifiers may decline a request (HTTP 200, `stop_reason: "refusal"`, with a `stop_details` category). A pre-output refusal has an empty `content` array and is not billed at all; a mid-stream refusal bills the already-streamed output — discard the partial output. Always check `stop_reason` before reading `content`. To retry on another model: the beta `fallbacks` parameter (Claude API and Claude Platform on AWS) retries server-side in one round trip; the GA SDKs' `BetaRefusalFallbackMiddleware` + `BetaFallbackState` handle client-side retry everywhere else (incl. Bedrock/Vertex); fallback credit refunds the cache-switch cost of client-side retries. See the migration guide's refusal section.
- **The raw chain of thought is never returned** — responses carry regular `thinking` blocks (not `redacted_thinking`): `display: "summarized"` returns a readable summary, `"omitted"` (the default) leaves the `thinking` field as an empty string. Replay rules: pass thinking blocks back exactly as received on the same model (including empty-text blocks — the API rejects *modified* blocks, not read ones); a **different** model **drops** them from the prompt (typically silently — not an error; the drop happens before pricing, so dropped blocks aren't billed and there's nothing to strip). Regular thinking blocks from other models replay across models freely.
- **Tokenizer** — same tokenizer as Opus 4.8 (introduced with Opus 4.7). Token counts are roughly unchanged when migrating from Opus 4.7/4.8; per-token pricing differs. Coming from Opus 4.6, Sonnet, Haiku, or older, re-baseline with `count_tokens`.
- **`refusal` stop reason — handle it, and opt into fallbacks by default** — safety classifiers may decline a request (HTTP 200, `stop_reason: "refusal"`, with a `stop_details` category). A pre-output refusal has an empty `content` array and is not billed at all; a mid-stream refusal bills the already-streamed output — discard the partial output. Always check `stop_reason` before reading `content`. Recovery is **opt-in on the API**: most Claude consumer surfaces ship with built-in Claude Opus 4.8 fallbacks, but an API request that doesn't opt in simply stops on a refusal — and false positives on benign adjacent work (security tooling, life-sciences tasks) do happen. **When you write `claude-fable-5` code, include the server-side `fallbacks` parameter by default** (`betas: ["server-side-fallback-2026-06-01"]` + `fallbacks: [{"model": "claude-opus-4-8"}]`; Claude API and Claude Platform on AWS): a declined request is transparently re-served by the fallback model inside the same call, with credit-style repricing applied automatically (a decline before any output isn't billed; the rescue bills at the fallback model's own rates). Tell the user you've enabled it; drop it only if they decline. The GA SDKs' client-side `BetaRefusalFallbackMiddleware` + `BetaFallbackState` handle retry everywhere server-side fallbacks aren't supported (incl. Amazon Bedrock, Vertex AI, Microsoft Foundry); fallback credit refunds the cache-switch cost of client-side retries. Code examples: the Refusal Fallbacks section of your language's claude-api doc; full semantics in the migration guide's refusal section.
- **No assistant prefill** — same as the rest of the 4.6+ family.
- **30-day data retention required** — Claude Fable 5 is not available under zero data retention; requests from an org whose retention configuration doesn't meet the requirement return `400 invalid_request_error`.
- **Longer turns, different prompting** — single requests on hard tasks can run many minutes (plan timeouts/streaming/progress UX); effort sweeps should include low/medium for routine work; prompts written for prior models are often too prescriptive and reduce output quality. See `shared/model-migration.md` → Migrating to Claude Fable 5 → Behavioral shifts (prompt-tunable) for the recommended prompt snippets (anti-overplanning, no-tidying, grounded progress claims, boundaries, async sub-agents, memory, `send_to_user`).
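The refusal bullet can be sketched in plain Python as follows. This is a sketch under stated assumptions: responses are modeled as plain dicts rather than SDK objects, and `with_fallback` / `text_or_none` are hypothetical helper names. The beta flag, the `fallbacks` shape, and the model strings are the ones quoted in the bullet.

```python
from typing import Any, Dict, List, Optional

FALLBACK_BETA = "server-side-fallback-2026-06-01"  # beta flag quoted in the text


def with_fallback(kwargs: Dict[str, Any], fallback_model: str = "claude-opus-4-8") -> Dict[str, Any]:
    """Return a copy of the request kwargs with server-side fallbacks enabled."""
    out = dict(kwargs)
    out["betas"] = list(kwargs.get("betas", [])) + [FALLBACK_BETA]
    out["fallbacks"] = [{"model": fallback_model}]
    return out


def text_or_none(response: Dict[str, Any]) -> Optional[str]:
    """Check stop_reason before reading content; None means the request was refused."""
    if response.get("stop_reason") == "refusal":
        return None  # discard any partial output from a mid-stream refusal
    blocks: List[Dict[str, Any]] = response.get("content", [])
    # Thinking blocks are skipped; only text blocks contribute to the answer.
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
```

A caller would pass `with_fallback({...})` to the beta messages endpoint and treat a `None` from `text_or_none` as "no usable answer".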
@@ -231,7 +245,7 @@ See `{lang}/claude-api/README.md` (Compaction section) for code examples. Full d
**Prefix match.** Any byte change anywhere in the prefix invalidates everything after it. Render order is `tools` → `system` → `messages`. Keep stable content first (frozen system prompt, deterministic tool list), and put volatile content (timestamps, per-request IDs, varying questions) after the last `cache_control` breakpoint.
**Mid-conversation operator instructions** (beta header `mid-conversation-system-2026-04-07`, on supporting models): append `{"role": "system", ...}` to `messages[]` instead of editing top-level `system`. Preserves the cached history prefix and is the prompt-injection-safe operator channel. See `shared/prompt-caching.md` § Mid-conversation system messages.
**Mid-conversation operator instructions** (Claude Opus 4.8 only; no beta header): append `{"role": "system", ...}` to `messages[]` instead of editing top-level `system`. Preserves the cached history prefix and is the prompt-injection-safe operator channel. See `shared/prompt-caching.md` § Mid-conversation system messages.
**Top-level auto-caching** (`cache_control: {type: "ephemeral"}` on `messages.create()`) is the simplest option when you don't need fine-grained placement. Max 4 breakpoints per request. Minimum cacheable prefix is ~1024 tokens — shorter prefixes silently won't cache.
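The prefix-match rule can be illustrated with a small request builder. This is a sketch, not the server's cache logic: `build_request` and `prefix_bytes` are hypothetical helpers, and serializing `tools` plus `system` with `json.dumps` only stands in for "the bytes before the breakpoint".

```python
import json
from typing import Any, Dict, List


def build_request(system_prompt: str, tools: List[Dict[str, Any]], question: str) -> Dict[str, Any]:
    """Stable content first, breakpoint on the last stable block, volatile content after."""
    return {
        "tools": sorted(tools, key=lambda t: t["name"]),  # deterministic tool order
        "system": [{
            "type": "text",
            "text": system_prompt,
            "cache_control": {"type": "ephemeral"},  # breakpoint on the last stable block
        }],
        "messages": [{"role": "user", "content": question}],  # varies per request
    }


def prefix_bytes(request: Dict[str, Any]) -> bytes:
    """Illustrative stand-in for the cached prefix: tools, then system."""
    return json.dumps([request["tools"], request["system"]], sort_keys=True).encode()
```

Two requests that differ only in the question produce identical `prefix_bytes`, which is the property a cache hit depends on; a timestamp inside `system_prompt` would break it.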
@@ -241,11 +255,128 @@ For placement patterns, architectural guidance, and the silent-invalidator audit
---
## Fast Mode (Quick Reference)
**Research preview, Opus 4.8 / 4.7 only.** Opus 4.7 fast mode is deprecated — after removal, `speed: "fast"` on 4.7 returns an error. Opus 4.8 is the durable fast-capable tier. Fast mode runs the same model at up to 2.5x higher output tokens per second, at premium pricing. Three things are required on every request: use the **beta** messages endpoint (`client.beta.messages.…`), pass the beta flag `fast-mode-2026-02-01`, and set `speed: "fast"` as a top-level request parameter (not a header, not in `extra_body`).
```python
client.beta.messages.create(
model="claude-opus-4-8", max_tokens=4096,
speed="fast", betas=["fast-mode-2026-02-01"],
messages=[...],
)
```
| Language | Beta flag | Speed parameter |
|---|---|---|
| Python | `betas=["fast-mode-2026-02-01"]` | `speed="fast"` |
| TypeScript / Ruby | `betas: ["fast-mode-2026-02-01"]` | `speed: "fast"` |
| Go | `[]anthropic.AnthropicBeta{anthropic.AnthropicBetaFastMode2026_02_01}` | `Speed: anthropic.BetaMessageNewParamsSpeedFast` |
| Java | `.addBeta(AnthropicBeta.FAST_MODE_2026_02_01)` | `.speed(MessageCreateParams.Speed.FAST)` |
| C# | `Betas = ["fast-mode-2026-02-01"]` | `Speed = Speed.Fast` (`Anthropic.Models.Beta.Messages`) |
| PHP | `betas: ['fast-mode-2026-02-01']` | `speed: 'fast'` |
| cURL | `anthropic-beta: fast-mode-2026-02-01` header | `"speed": "fast"` in body |
`response.usage.speed` reports which speed was used. Fast mode has its own rate limit separate from standard Opus; on 429, either retry after the `retry-after` delay or drop `speed` and fall back to standard (note: switching speed invalidates prompt cache). Not available with Batch API, Priority Tier, Claude Platform on AWS, or third-party platforms.
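The two 429 options above can be sketched as a pure decision function. The name `after_rate_limit` and the `max_wait` threshold are assumptions made for illustration; the caller is expected to read the `retry-after` value from the error response and pass it in as seconds.

```python
from typing import Any, Dict, Optional, Tuple

FAST_MODE_BETA = "fast-mode-2026-02-01"


def after_rate_limit(
    kwargs: Dict[str, Any], retry_after: Optional[float], max_wait: float = 5.0
) -> Tuple[Dict[str, Any], float]:
    """Decide how to retry a fast-mode request that returned 429.

    Returns (kwargs for the next attempt, seconds to sleep first).
    """
    if retry_after is not None and retry_after <= max_wait:
        return dict(kwargs), retry_after  # short wait: stay on fast mode
    # Long or unknown wait: drop `speed` and the fast-mode beta flag.
    standard = {k: v for k, v in kwargs.items() if k not in ("speed", "betas")}
    other_betas = [b for b in kwargs.get("betas", []) if b != FAST_MODE_BETA]
    if other_betas:
        standard["betas"] = other_betas
    return standard, 0.0  # note: switching speed invalidates the prompt cache
```

Because falling back gives up the cached prefix, a higher `max_wait` may be worthwhile for requests with a large cached prompt.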
---
## Task Budgets (Quick Reference)
**Beta, Fable 5 / Opus 4.8 / 4.7.** A task budget gives Claude a token ceiling for an agentic loop so it paces itself and finishes gracefully instead of being cut off. Set `task_budget` inside `output_config` on `client.beta.messages.stream(...)` with beta flag `task-budgets-2026-03-13` — use streaming so the large `max_tokens` doesn't hit HTTP timeouts:
```python
with client.beta.messages.stream(
model="claude-opus-4-8", max_tokens=128000,
output_config={"effort": "high", "task_budget": {"type": "tokens", "total": 64000}},
betas=["task-budgets-2026-03-13"],
messages=[...], tools=[...],
) as stream:
response = stream.get_final_message()
```
`task_budget` fields: `type` (always `"tokens"`), `total`, and optional `remaining` (defaults to `total`). The server injects a countdown marker Claude sees during generation; the budget counts what Claude generates and the tool results it reads this turn — **not** the full history you resend each request.
**Observing spend:** accumulate `response.usage.output_tokens` (plus the token count of the tool-result blocks you append) across loop iterations if you want to display progress. Leave `remaining` unset in the normal loop — the server tracks the countdown itself, and passing a client-computed `remaining` while also resending full history under-reports the budget. **Only pass `remaining`** when you compact or rewrite history between requests and the server can no longer derive prior spend.
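Client-side spend tracking for display purposes might look like the sketch below. `SpendTracker` is a hypothetical class; how the tool-result token count is obtained is left to the caller and is an assumption here. The tracked value is for progress display only and is not sent back as `remaining` in the normal loop.

```python
from dataclasses import dataclass


@dataclass
class SpendTracker:
    """Accumulates budget spend across agentic-loop iterations (display only)."""

    total: int
    spent: int = 0

    def record(self, output_tokens: int, tool_result_tokens: int = 0) -> None:
        # Budget counts generated tokens plus the tool results read this turn.
        self.spent += output_tokens + tool_result_tokens

    @property
    def remaining(self) -> int:
        return max(self.total - self.spent, 0)
```

In a loop, call `record(response.usage.output_tokens, n)` after each iteration, where `n` is your own count for the tool-result blocks you are about to append.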
---
## Provider Clients (Quick Reference)
When targeting Claude on a third-party platform, use that platform's dedicated client class — not the first-party `Anthropic()` client with a `base_url` override. After construction the client exposes the same `messages.create` / `.stream` surface as the first-party SDK.
### Amazon Bedrock
Use the **Mantle** client (Messages-API Bedrock endpoint). Bedrock model IDs take an `anthropic.` prefix (e.g. `"anthropic.claude-opus-4-8"`). Region is required.
| Language | Client |
|---|---|
| Python | `from anthropic import AnthropicBedrockMantle` → `AnthropicBedrockMantle(aws_region="…")` |
| TypeScript | `import { AnthropicBedrockMantle } from "@anthropic-ai/bedrock-sdk"` → `new AnthropicBedrockMantle({ awsRegion: "…" })` |
| Go | `bedrock.NewMantleClient(ctx, bedrock.MantleClientConfig{ AWSRegion: "…" })` |
| Java | `AnthropicOkHttpClient.builder().backend(BedrockMantleBackend.fromEnv()).build()` (from `com.anthropic.bedrock.backends`) |
| C# | `new AnthropicBedrockMantleClient(new() { AwsRegion = "…" })` (package `Anthropic.Bedrock`) |
| PHP | `use Anthropic\Bedrock\MantleClient;` → `new MantleClient(awsRegion: '…')` |
| Ruby | `Anthropic::BedrockMantleClient.new(aws_region: "…")` |
`AnthropicBedrock` / `BedrockClient` / `BedrockBackend` (without `Mantle`) are the legacy `bedrock-runtime` InvokeModel path — prefer the Mantle client for new code.
### Microsoft Foundry
| Language | Client |
|---|---|
| Python | `from anthropic import AnthropicFoundry` → `AnthropicFoundry(api_key=…, resource="…")` |
| TypeScript | `import AnthropicFoundry from "@anthropic-ai/foundry-sdk"` → `new AnthropicFoundry({ … })` |
| Java | `AnthropicOkHttpClient.builder().backend(FoundryBackend.fromEnv()).build()` (from `com.anthropic.foundry.backends`) |
| C# | `new AnthropicFoundryClient(new AnthropicFoundryApiKeyCredentials(…))` (package `Anthropic.Foundry`) |
| PHP | `Foundry\Client::withCredentials(…)` |
The Go and Ruby SDKs do not currently support Foundry. For Ruby, use the standard `Anthropic::Client.new(base_url: "<foundry endpoint>")` as a fallback (Entra ID auth is not built in). For Claude Platform on AWS, see `shared/claude-platform-on-aws.md`.
### Google Cloud Vertex AI
Two required constructor args: GCP `project_id` and `region`. Vertex model IDs take **no prefix** — current-generation models (Opus 4.8/4.7/4.6, Sonnet 4.6) use the bare first-party ID (e.g. `"claude-opus-4-8"`); dated-snapshot models use an `@` version separator (e.g. `claude-opus-4-5@20251101`, **not** `claude-opus-4-5-20251101`). Auth is GCP ADC (`gcloud auth application-default login`); no Anthropic API key. `region` can be `"global"` (recommended), a multi-region (`"us"`/`"eu"`), or a specific region. After construction, use the same `messages.create` / `.stream` surface.
| Language | Client |
|---|---|
| Python | `from anthropic import AnthropicVertex` → `AnthropicVertex(project_id="…", region="…")` (install `"anthropic[vertex]"`) |
| TypeScript | `import { AnthropicVertex } from "@anthropic-ai/vertex-sdk"` → `new AnthropicVertex({ projectId, region })` |
| Go | `import "github.com/anthropics/anthropic-sdk-go/vertex"` → `anthropic.NewClient(vertex.WithGoogleAuth(ctx, region, projectID))` |
| Java | `AnthropicOkHttpClient.builder().backend(VertexBackend.builder().region("…").project("…").build()).build()` (from `com.anthropic.vertex.backends`) |
| C# | `new AnthropicClient { Backend = new VertexBackend(projectId, region) }` (package `Anthropic.Vertex`) |
| PHP | `use Anthropic\Vertex;` → `Vertex\Client::fromEnvironment(location: '…', projectId: '…')` (note `location`, not `region`) |
| Ruby | `Anthropic::VertexClient.new(region: "…", project_id: "…")` |
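The model-ID rules for Vertex AI and Bedrock can be sketched as string helpers. Both function names are hypothetical, and the helpers encode only the rules stated in this section (the `@` separator for dated snapshots, the `anthropic.` prefix); confirm the resulting ID against your platform's model list before relying on it.

```python
import re

# A dated snapshot ends in an 8-digit date, e.g. claude-opus-4-5-20251101.
_DATED = re.compile(r"^(?P<name>.+)-(?P<date>\d{8})$")


def vertex_model_id(first_party_id: str) -> str:
    """Dated snapshots use `@` before the date; current-generation IDs pass through."""
    m = _DATED.match(first_party_id)
    return f"{m['name']}@{m['date']}" if m else first_party_id


def bedrock_model_id(first_party_id: str) -> str:
    """Apply the `anthropic.` prefix once (only the prefix rule from the text)."""
    if first_party_id.startswith("anthropic."):
        return first_party_id
    return "anthropic." + first_party_id
```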
---
## Context Editing (Quick Reference)
**Beta.** Context editing **clears** old tool results or thinking blocks from the conversation before the model sees it; it is **not compaction** (which summarizes). On `client.beta.messages.*` with beta `context-management-2025-06-27`, pass `context_management.edits` with a strategy type:
```python
client.beta.messages.create(
model="claude-opus-4-8", max_tokens=4096,
betas=["context-management-2025-06-27"],
context_management={"edits": [{"type": "clear_tool_uses_20250919"}]},
tools=[...], messages=[...],
)
```
Strategy types: `clear_tool_uses_20250919` (clears old tool results; optional `clear_tool_inputs: true` also clears the tool_use params) and `clear_thinking_20251015` (clears thinking blocks). Do **not** use `compact_20260112` or beta `compact-2026-01-12` — those are the separate compaction feature.
---
## Mid-Conversation System Messages (Quick Reference)
**Claude Opus 4.8 only; no beta header.** Append `{"role": "system", "content": "…"}` to the `messages` array (not the top-level `system` field) to add an operator instruction mid-conversation without invalidating the cached prefix. Use the regular `client.messages.create` — there is no beta. A mid-conversation system message must follow a `user` message (or an `assistant` message ending in server-tool use), and must be either the last entry in `messages` or be followed by an `assistant` turn — it cannot be `messages[0]`. Availability: `shared/platform-availability.md`. See `shared/prompt-caching.md` § Mid-conversation system messages.
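The placement rules above can be checked before sending a request. The validator below is a hypothetical pre-flight helper, not part of any SDK, and it assumes that "ending in server-tool use" means the assistant message's last content block has type `server_tool_use`.

```python
from typing import Any, Dict, List


def _ends_in_server_tool_use(msg: Dict[str, Any]) -> bool:
    content = msg.get("content")
    return (
        isinstance(content, list)
        and bool(content)
        and content[-1].get("type") == "server_tool_use"  # assumed block type
    )


def system_placement_errors(messages: List[Dict[str, Any]]) -> List[str]:
    """Return a list of rule violations for mid-conversation system messages."""
    errors: List[str] = []
    for i, msg in enumerate(messages):
        if msg.get("role") != "system":
            continue
        if i == 0:
            errors.append("system message cannot be messages[0]")
            continue
        prev = messages[i - 1]
        follows_ok = prev.get("role") == "user" or (
            prev.get("role") == "assistant" and _ends_in_server_tool_use(prev)
        )
        if not follows_ok:
            errors.append(f"messages[{i}] must follow a user message or server-tool use")
        if i + 1 < len(messages) and messages[i + 1].get("role") != "assistant":
            errors.append(f"messages[{i}] must be last or be followed by an assistant turn")
    return errors
```

An empty list means the array satisfies the three stated rules; it does not check model or platform availability.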
---
## Managed Agents (Beta)
**Managed Agents** is a third surface: server-managed stateful agents with Anthropic-hosted tool execution. You create a persisted, versioned Agent config (`POST /v1/agents`), then start Sessions that reference it. Each session provisions a container as the agent's workspace — bash, file ops, and code execution run there; the agent loop itself runs on Anthropic's orchestration layer and acts on the container via tools. The session streams events; you send messages and tool results back.
**Managed Agents is available on the first-party API and Claude Platform on AWS.** It is **not** available on Amazon Bedrock, Google Vertex AI, or Microsoft Foundry — for agents there, use Claude API + tool use.
Availability: `shared/platform-availability.md`. For agents on Bedrock / Vertex / Foundry (where Managed Agents is unsupported), use Claude API + tool use.
**Mandatory flow:** Agent (once) → Session (every run). `model`/`system`/`tools` live on the agent, never the session. See `shared/managed-agents-overview.md` for the full reading guide, beta headers, and pitfalls.
@@ -255,7 +386,7 @@ For placement patterns, architectural guidance, and the silent-invalidator audit
| Subcommand | Action |
|---|---|
| `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. **Read `shared/managed-agents-onboarding.md` immediately** and follow its interview script: mental model → know-or-explore branch → template config → session setup → **pre-flight viability check** → emit code. The viability check (reconcile the stated job against configured tools/credentials/data) catches under-resourced setups — missing a tool, credential, or data access — before the agent burns budget. Do not summarize — run the interview. |
| `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. **Read `shared/managed-agents-onboarding.md` immediately** and follow its interview script: **describe → configure the agent (propose, don't interrogate) → environment → session** (same arc as the Console quickstart, auth deferred to the session step) — defaults and inline suggestions do the work, with a silent viability gate (job vs tools/credentials/data) before any code is emitted. Do not summarize — run the interview. |
**Reading guide:** Start with `shared/managed-agents-overview.md`, then the topical `shared/managed-agents-*.md` files (core, environments, tools, events, outcomes, multiagent, webhooks, memory, scheduled-deployments, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI (`ant`) is one convenient way to create agents and environments from version-controlled YAML — see `shared/anthropic-cli.md`. If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# has beta Managed Agents support via `client.Beta.Agents` and related namespaces.
@@ -263,14 +394,65 @@ For placement patterns, architectural guidance, and the silent-invalidator audit
**When the user asks "how do I write the client code for X":** reach for `shared/managed-agents-client-patterns.md` — covers lossless stream reconnect, `processed_at` queued/processed gate, interrupt, `tool_confirmation` round-trip, the correct idle/terminated break gate, post-idle status race, stream-first ordering, file-mount gotchas, keeping credentials host-side via custom tools, etc.
**When the user wants the agent to run on a schedule** (cron, "every night", "weekly report"): read `shared/managed-agents-scheduled-deployments.md` — deployments fire sessions autonomously on a cron cadence, with run records, retries, and auto-pause.
**When the user wants the agent to run on a schedule** (cron, "every night", "weekly report"): read `shared/managed-agents-scheduled-deployments.md` — deployments fire sessions autonomously on a cron cadence, with per-firing run records and lifecycle controls (pause/unpause/archive).
---
## Server Tools (Quick Reference)
Server-side tools run on Anthropic's infrastructure — no client-side execution loop. Declare in `tools`; results arrive as content blocks in the same response. **No beta header** unless noted. **Prefer the latest type variant your model supports.** The `_20260209` web search / web fetch variants below (dynamic filtering) require Opus 4.8/4.7/4.6 or Sonnet 4.6; the basic variants for older models are listed after the table.
| Tool | `type` | `name` | Key optional params | Result block type |
|---|---|---|---|---|
| Web search | `web_search_20260209` | `web_search` | `max_uses`, `allowed_domains`/`blocked_domains`, `user_location` | `web_search_tool_result` → `.content` is a list of `web_search_result` |
| Web fetch | `web_fetch_20260209` | `web_fetch` | `max_uses`, `allowed_domains`/`blocked_domains`, `citations`, `max_content_tokens` | `web_fetch_tool_result` → `.content` is a `web_fetch_result` with a `document` block |
| Code execution | `code_execution_20260521` | `code_execution` | none | `bash_code_execution_tool_result` → `.content.stdout` / `.stderr` / `.return_code` |
| Tool search (regex) | `tool_search_tool_regex_20251119` | `tool_search_tool_regex` | mark other tools `defer_loading: true` | `tool_search_tool_result` |
| Tool search (BM25) | `tool_search_tool_bm25_20251119` | `tool_search_tool_bm25` | mark other tools `defer_loading: true` | `tool_search_tool_result` |
`web_search_20260209` / `web_fetch_20260209` have built-in dynamic filtering — code execution runs under the hood, so do **not** separately declare `code_execution` in `tools` (a second execution environment confuses the model). For models older than Opus 4.6 / Sonnet 4.6, use the basic variants `web_search_20250305` / `web_fetch_20250910` instead; on Vertex AI only basic `web_search_20250305` is available. `code_execution_20260120` (REPL persistence + programmatic tool calling) runs on Opus 4.5+ / Sonnet 4.5+. **Go SDK only**: `code_execution_20260521` lives under `client.Beta.Messages.New` with `Betas: []anthropic.AnthropicBeta{"code-execution-2025-08-25"}` (other languages use plain `client.messages.create`); `code_execution_20260120` uses the non-beta `client.Messages.New` in Go like everywhere else. Web fetch only fetches URLs already present in the conversation. Provider availability varies by tool — see `shared/platform-availability.md`. See `shared/tool-use-concepts.md` for `pause_turn` handling.
## Document & File Input (Quick Reference)
**PDF (base64, no beta):** `{"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": <b64 string>}}` in user content, placed before the text block. Base64 string must have no newlines. Limits: 32 MB request, 600 pages (100 for 200k-context models). Java: `ContentBlockParam.ofDocument(DocumentBlockParam... Base64PdfSource.builder().data(...))`.
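Building the PDF content in Python might look like the sketch below. `pdf_user_content` is a hypothetical helper name; the block shape is the one given above. `base64.b64encode` is used because it emits a single line, unlike `base64.encodebytes`, which inserts newlines.

```python
import base64
from typing import Any, Dict, List


def pdf_user_content(pdf_bytes: bytes, prompt: str) -> List[Dict[str, Any]]:
    """Return user content with the document block placed before the text block."""
    data = base64.b64encode(pdf_bytes).decode("ascii")  # no newlines in the output
    return [
        {
            "type": "document",
            "source": {"type": "base64", "media_type": "application/pdf", "data": data},
        },
        {"type": "text", "text": prompt},
    ]
```

The returned list goes in the `content` field of a `user` message.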
**Files API (beta `files-api-2025-04-14`):** upload via `client.beta.files.upload(...)` → response `id` is the `file_id`. Reference it as `{"type": "document", "source": {"type": "file", "file_id": "..."}}` for PDF/text, or `{"type": "image", ...}` for images — the content-block type must match the file's MIME type. The beta header is required on **both** the upload and the `messages.create` that references the file. Availability: `shared/platform-availability.md`.
**Citations (no beta):** set `citations: {enabled: true}` on each `document` content block (all or none). Response splits into multiple `text` blocks; cited blocks carry a `citations` array. Each citation has `cited_text`, `document_index`, `document_title`, and a location by `type`: `char_location` (`start_char_index`/`end_char_index`) for plain text, `page_location` (`start_page_number`/`end_page_number`, 1-indexed) for PDF, `content_block_location` for custom content. Incompatible with `output_config.format`.
## Tool Use Patterns (Quick Reference)
**Strict tool use (no beta):** set `strict: true` as a top-level field on the tool definition (alongside `name`/`description`/`input_schema`), **not** on `tool_choice`. Schema must have `additionalProperties: false` + `required`. Guarantees `tool_use.input` validates exactly. Go: `Strict: anthropic.Bool(true)` + `additionalProperties` via `InputSchema.ExtraFields`; Java: `.strict(true)` + `.putAdditionalProperty("additionalProperties", JsonValue.from(false))`.
**Parallel tool use (default on):** one assistant message may contain multiple `tool_use` blocks. Execute them concurrently, then return **all** `tool_result` blocks in a **single** user message (don't split across multiple messages). For a failed tool, return `tool_result` with `is_error: true` — don't drop it.
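A manual-loop step for parallel tool use could be sketched like this. It assumes the assistant content is available as a list of plain dicts and that `handlers` is your own mapping from tool name to a Python callable; both are illustrative choices, and `run_tool_calls` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_tool_calls(
    assistant_content: List[Dict[str, Any]],
    handlers: Dict[str, Callable[..., Any]],
) -> Dict[str, Any]:
    """Run every tool_use block concurrently; return ONE user message with all results."""
    calls = [b for b in assistant_content if b.get("type") == "tool_use"]

    def run_one(call: Dict[str, Any]) -> Dict[str, Any]:
        try:
            output = handlers[call["name"]](**call["input"])
            return {"type": "tool_result", "tool_use_id": call["id"], "content": str(output)}
        except Exception as exc:
            # A failed tool still gets a result block, flagged with is_error.
            return {
                "type": "tool_result",
                "tool_use_id": call["id"],
                "content": f"{type(exc).__name__}: {exc}",
                "is_error": True,
            }

    with ThreadPoolExecutor(max_workers=max(len(calls), 1)) as pool:
        results = list(pool.map(run_one, calls))  # map preserves call order
    return {"role": "user", "content": results}
```

Threads suit I/O-bound tools; CPU-bound handlers may call for a process pool instead.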
**Tool Runner (SDK beta helper):** drives the tool-call loop for you via `client.beta.messages.*`. Python: `@beta_tool` decorator + `client.beta.messages.tool_runner(...)` → `runner.until_done()`. TypeScript: `betaZodTool({...})` from `@anthropic-ai/sdk/helpers/beta/zod` + `client.beta.messages.toolRunner(...)` → `await runner`. Go: `toolrunner.NewBetaToolFromJSONSchema(...)` + `client.Beta.Messages.NewToolRunner(...)` → `.RunToCompletion(ctx)`. Java requires `.addBeta("structured-outputs-2025-11-13")`. Ruby: `Anthropic::BaseTool` subclass + `client.beta.messages.tool_runner(...)`. PHP: `BetaRunnableTool` + `->toolRunner(...)`. C#: raw JSON-schema tools + `BetaToolRunner` via `client.Beta.Messages.ToolRunner(...)`.
**Programmatic tool calling (no beta header):** Claude calls your custom tool from inside code execution. Add `{"type": "code_execution_20260120", "name": "code_execution"}` **and** set `"allowed_callers": ["code_execution_20260120"]` on your custom tool. Opus 4.5+ / Sonnet 4.5+ (availability: `shared/platform-availability.md`). When responding to a pending programmatic call, the user message must contain **only** `tool_result` blocks (no text). Not compatible with `strict: true`, `disable_parallel_tool_use`, forced `tool_choice`, or MCP tools.
## Other API Surfaces (Quick Reference)
**Message Batches (no beta; availability: `shared/platform-availability.md`):** `client.messages.batches.create(requests=[{custom_id, params}, ...])` → poll `client.messages.batches.retrieve(id).processing_status` until `"ended"` → stream `client.messages.batches.results(id)`. Each result has `.custom_id` + `.result.type` (`succeeded`/`errored`/`canceled`/`expired`); on success read `.result.message.content`. Python wraps requests as `Request(custom_id=..., params=MessageCreateParamsNonStreaming(...))`. Results arrive in **any order** — key by `custom_id`, never by position.
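Keying batch results by `custom_id` can be sketched as below. The results are modeled as plain dicts that mirror the attribute paths named above (the SDK returns objects), and `index_results` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable


def index_results(results: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Index batch results by custom_id, since arrival order is not guaranteed."""
    by_id: Dict[str, Dict[str, Any]] = {}
    for item in results:
        result = item["result"]
        if result["type"] == "succeeded":
            by_id[item["custom_id"]] = {"ok": True, "content": result["message"]["content"]}
        else:
            # errored / canceled / expired: keep the status for retry decisions
            by_id[item["custom_id"]] = {"ok": False, "status": result["type"]}
    return by_id
```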
**Models API (no beta; availability: `shared/platform-availability.md`):** `client.models.list()` (auto-paginates) and `client.models.retrieve("claude-opus-4-8")`. Each model object has `id`, `display_name`, `created_at`, and — since Mar 2026 — `max_input_tokens` (the context window), `max_tokens` (the output cap), and `capabilities`. There is no `context_window` field.
**Stop details (GA, Opus 4.7+):** `response.stop_details` is populated **only when `stop_reason == "refusal"`** (fields: `type: "refusal"`, `category: "cyber"|"bio"|null`, `explanation`). It is `null` for every other `stop_reason` (`end_turn`, `max_tokens`, `tool_use`, `pause_turn`, …) — always guard before reading.
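A guard along these lines keeps the `null` case from raising. It is a sketch that assumes only the fields listed above (`stop_reason`, `stop_details.category`, `stop_details.explanation`); `describe_stop` is a hypothetical helper, and `SimpleNamespace` stands in for a real response object.

```python
from types import SimpleNamespace

def describe_stop(response) -> str:
    """Summarize why a response stopped, reading stop_details only on refusal."""
    details = getattr(response, "stop_details", None)
    if response.stop_reason != "refusal" or details is None:
        return f"stopped: {response.stop_reason}"
    category = details.category or "uncategorized"  # category may be null
    return f"refused ({category}): {details.explanation}"

normal = SimpleNamespace(stop_reason="end_turn", stop_details=None)
refused = SimpleNamespace(
    stop_reason="refusal",
    stop_details=SimpleNamespace(type="refusal", category=None,
                                 explanation="declined by policy"),
)
```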
**Client config (no beta):** `timeout` default 10 min; **units differ by SDK** — Python/Ruby: seconds; TypeScript: **milliseconds**; Go `option.WithRequestTimeout(time.Duration)`; Java `Duration`; C# `TimeSpan`. TS scales the default up to 60 min for large `max_tokens` on non-streaming requests; Java does so for streaming requests (Java non-streaming scales from 30s to 10 min). `max_retries`/`maxRetries` default 2 (retries 408/409/429/5xx + connection errors). `base_url` (or `ANTHROPIC_BASE_URL` env). Per-request override: Python `client.with_options(timeout=5.0).messages.create(...)`; TS `client.messages.create({...}, {timeout: 5_000})`; Ruby `request_options: {timeout: 5}`. Timeouts are retried — wall-clock can reach `timeout × (max_retries+1)`.
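The wall-clock bound and the unit mismatch are easy to check with a little arithmetic. The helpers below are hypothetical and ignore any backoff delay between retries, so treat the result as a lower bound on the worst case.

```python
def worst_case_seconds(timeout_seconds: float, max_retries: int = 2) -> float:
    """Upper bound on time spent in attempts: timeout x (max_retries + 1)."""
    return timeout_seconds * (max_retries + 1)

def to_typescript_timeout(timeout_seconds: float) -> int:
    """Convert a Python/Ruby-style timeout (seconds) to TypeScript milliseconds."""
    return int(timeout_seconds * 1000)

default_budget = worst_case_seconds(600.0)   # 10 min default, 2 retries
per_request_ms = to_typescript_timeout(5.0)  # matches {timeout: 5_000}
```

With the defaults, a request that times out on every attempt can occupy 30 minutes before the error surfaces.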
## Workload Identity Federation (Quick Reference)
**GA, no beta header.** Construct the normal zero-arg client (`Anthropic()` / `new Anthropic()` / `anthropic.NewClient()` / `AnthropicOkHttpClient.fromEnv()`); the SDK auto-detects WIF when **all** of `ANTHROPIC_FEDERATION_RULE_ID`, `ANTHROPIC_ORGANIZATION_ID`, `ANTHROPIC_SERVICE_ACCOUNT_ID`, and `ANTHROPIC_IDENTITY_TOKEN_FILE` (or `ANTHROPIC_IDENTITY_TOKEN`) are set, exchanges the JWT at `/v1/oauth/token`, and auto-refreshes. `ANTHROPIC_WORKSPACE_ID` does not gate activation — required only when the federation rule spans multiple workspaces (else 400 `workspace_id_required`), optional for single-workspace rules. `ANTHROPIC_API_KEY` or `ANTHROPIC_AUTH_TOKEN` (even empty) outrank WIF, and a set `ANTHROPIC_PROFILE` also wins over the federation env vars (a missing named profile is an error, not a fall-through) — unset all three.
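A preflight check over the environment can catch the silent-override case before a client is constructed. This is a sketch based only on the variable names above; `wif_preflight` is hypothetical, and it conservatively treats mere presence of any overriding variable as a blocker.

```python
REQUIRED = (
    "ANTHROPIC_FEDERATION_RULE_ID",
    "ANTHROPIC_ORGANIZATION_ID",
    "ANTHROPIC_SERVICE_ACCOUNT_ID",
)
TOKEN_SOURCES = ("ANTHROPIC_IDENTITY_TOKEN_FILE", "ANTHROPIC_IDENTITY_TOKEN")
OVERRIDES = ("ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_PROFILE")

def wif_preflight(env: dict) -> dict:
    """Report which variables are missing and which would outrank WIF."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in TOKEN_SOURCES):
        missing.append(" or ".join(TOKEN_SOURCES))
    # Presence counts even when the value is empty, so test membership.
    overriding = [name for name in OVERRIDES if name in env]
    return {
        "missing": missing,
        "overriding": overriding,
        "wif_active": not missing and not overriding,
    }

env = {
    "ANTHROPIC_FEDERATION_RULE_ID": "rule_123",
    "ANTHROPIC_ORGANIZATION_ID": "org_123",
    "ANTHROPIC_SERVICE_ACCOUNT_ID": "svc_123",
    "ANTHROPIC_IDENTITY_TOKEN_FILE": "/var/run/token",
    "ANTHROPIC_API_KEY": "",  # blanked, not unset: still outranks WIF
}
report = wif_preflight(env)
```

In practice the argument would be `dict(os.environ)`; the literal above only makes the blanked-key case visible.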
---
## Reading Guide
After detecting the language, read the relevant files based on what the user needs:
After detecting the language, read the relevant files based on what the user needs.
**All SDK languages use the same multi-file layout** — directory `{lang}/claude-api/` containing `README.md` (install, client init, basic request, thinking, caching, stop details, misc), `tool-use.md` (tool definitions, agentic loop, Anthropic-defined tools, structured outputs), `streaming.md`, `batches.md`, `files-api.md`. Not every language has every file (e.g., Ruby has no `batches.md`); if a file is absent, that feature's example is not yet documented for that language — fall back to the cURL shape or WebFetch the SDK repo from `shared/live-sources.md`. **cURL** → `curl/examples.md`.
The Quick Task Reference below uses the `{lang}/claude-api/FILE.md` path notation for all languages.
### Quick Task Reference
@@ -304,11 +486,11 @@ After detecting the language, read the relevant files based on what the user nee
→ Read `{lang}/claude-api/README.md` + `{lang}/claude-api/files-api.md`
**Managed Agents (server-managed stateful agents with workspace):**
→ Read `shared/managed-agents-overview.md` + the rest of the `shared/managed-agents-*.md` files. For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI (`ant`) is one convenient way to create agents and environments from version-controlled YAML — see `shared/anthropic-cli.md`. If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# has beta Managed Agents support — see `csharp/claude-api.md` for details, or `curl/managed-agents.md` for raw HTTP reference.
→ Read `shared/managed-agents-overview.md` + the rest of the `shared/managed-agents-*.md` files. For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI (`ant`) is one convenient way to create agents and environments from version-controlled YAML — see `shared/anthropic-cli.md`. If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# has beta Managed Agents support — see `csharp/claude-api/README.md` for details, or `curl/managed-agents.md` for raw HTTP reference.
### Claude API (Full File Reference)
Read the **language-specific Claude API folder** (`{language}/claude-api/`):
Read the **language-specific Claude API source** (`{language}/claude-api/` for every SDK language, `curl/examples.md` for cURL):
1. **`{language}/claude-api/README.md`** — **Read this first.** Installation, quick start, common patterns, error handling.
2. **`shared/tool-use-concepts.md`** — Read when the user needs function calling, code execution, memory, or structured outputs. Covers conceptual foundations.
@@ -318,11 +500,11 @@ Read the **language-specific Claude API folder** (`{language}/claude-api/`):
6. **`{language}/claude-api/batches.md`** — Read when processing many requests offline (not latency-sensitive). Runs asynchronously at 50% cost.
7. **`{language}/claude-api/files-api.md`** — Read when sending the same file across multiple requests without re-uploading.
8. **`shared/prompt-caching.md`** — Read when adding or optimizing prompt caching. Covers prefix-stability design, breakpoint placement, and anti-patterns that silently invalidate cache.
9. **`shared/error-codes.md`** — Read when debugging HTTP errors or implementing error handling.
9. **`shared/error-codes.md`** — Read when debugging HTTP errors or implementing error handling. Includes the per-SDK typed exception class table and the Go `errors.As` pattern.
10. **`shared/model-migration.md`** — Read when upgrading to newer models, replacing retired models, or translating `budget_tokens` / prefill patterns to the current API.
11. **`shared/live-sources.md`** — WebFetch URLs for fetching the latest official documentation.
> **Note:** For Java, Go, Ruby, C#, PHP, and cURL — these have a single file each covering all basics. Read that file plus `shared/tool-use-concepts.md` and `shared/error-codes.md` as needed.
Not every language has every file (e.g., Ruby has no `batches.md`); if a file is absent, that feature's example is not yet documented for that language.
> **Note:** For the Managed Agents file reference, see the `## Managed Agents (Beta)` section above — it lists every `shared/managed-agents-*.md` file and the language-specific READMEs.
@@ -344,13 +526,36 @@ Live documentation URLs are in `shared/live-sources.md`.
- **Fable 5 / Opus 4.8 / 4.7 thinking:** Adaptive only. `thinking: {type: "enabled", budget_tokens: N}` returns 400 — `budget_tokens` is fully removed (along with `temperature`, `top_p`, `top_k`). Use `thinking: {type: "adaptive"}`. Opus 4.8 inherits this surface from 4.7 with no new breaking changes; Fable 5 adds one — an explicit `thinking: {type: "disabled"}` returns a 400 (accepted on 4.7/4.8); omit the param instead.
- **Opus 4.6 / Sonnet 4.6 thinking:** Use `thinking: {type: "adaptive"}` — do NOT use `budget_tokens` for new 4.6 code (deprecated on both Opus 4.6 and Sonnet 4.6; for gradual migration of existing code, see the transitional escape hatch in `shared/model-migration.md` — note this carve-out does not apply to Fable 5, Opus 4.7 or 4.8). For older models, `budget_tokens` must be less than `max_tokens` (minimum 1024). This will throw an error if you get it wrong.
- **Prefill removed (Fable 5 and the 4.6/4.7/4.8 family):** Assistant message prefills (last-assistant-turn prefills) return a 400 error on Fable 5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6. Use structured outputs (`output_config.format`) or system prompt instructions to control response format instead. (One exception: the fallback-credit prefill claim — when redeeming a credit with `fallback_has_prefill_claim: true`, the server accepts the echoed assistant message; see the migration guide's refusal section.)
- **Fable 5 `refusal` stop reason:** Safety classifiers may decline a request — a successful HTTP 200 with `stop_reason: "refusal"` (pre-output: empty `content`, nothing billed; mid-stream: partial output billed — discard it). Check `stop_reason` before reading `response.content[0]`, or you'll hit index errors on refused requests. To retry on another model, replaying history as-is works — other models silently ignore the refused model's thinking blocks — but ignored blocks still bill input tokens, so strip them when switching for good (exception: a fallback-credit redemption must echo the refused body exactly, thinking blocks included).
- **Fable 5 tokenizer:** ~30% more tokens for the same content vs Opus-tier models. Token counts, context-window budgets, and `max_tokens` values measured on other models don't transfer — re-measure with `count_tokens` passing `model: "claude-fable-5"` (the response includes counts under both tokenizers).
- **Fable 5 `refusal` stop reason:** Safety classifiers may decline a request — a successful HTTP 200 with `stop_reason: "refusal"` (pre-output: empty `content`, nothing billed; mid-stream: partial output billed — discard it). Check `stop_reason` before reading `response.content[0]`, or you'll hit index errors on refused requests. To retry on another model, replay the history as-is — other models drop the refused model's thinking blocks from the prompt, unbilled; no stripping needed (and a fallback-credit redemption must echo the refused body exactly anyway, thinking blocks included). Fallbacks are **opt-in** — new `claude-fable-5` code should include the server-side `fallbacks` parameter by default so a refusal doesn't fail the request outright; see the Claude Fable 5 section above.
- **Fable 5 tokenizer:** Same tokenizer as Opus 4.8 — token counts are roughly unchanged when migrating from Opus 4.7/4.8. Coming from Opus 4.6, Sonnet, Haiku, or older, token counts differ (the Opus 4.7 tokenizer uses ~1× to 1.35× as many tokens) — re-measure by calling `count_tokens` once with each model and comparing `input_tokens`.
- **Confirm migration scope before editing:** When a user asks to migrate code to a newer Claude model without naming a specific file, directory, or file list, **ask which scope to apply first** — the entire working directory, a specific subdirectory, or a specific set of files. Do not start editing until the user confirms. Imperative phrasings like "migrate my codebase", "move my project to X", "upgrade to Sonnet 4.6", or bare "migrate to Opus 4.8" are **still ambiguous** — they tell you what to do but not where, so ask. Proceed without asking only when the prompt names an exact file, a specific directory, or an explicit file list ("migrate `app.py`", "migrate everything under `services/`", "update `a.py` and `b.py`"). See `shared/model-migration.md` Step 0.
- **`max_tokens` defaults:** Don't lowball `max_tokens` — hitting the cap truncates output mid-thought and requires a retry. For non-streaming requests, default to `~16000` (keeps responses under SDK HTTP timeouts). For streaming requests, default to `~64000` (timeouts aren't a concern, so give the model room). Only go lower when you have a hard reason: classification (`~256`), cost caps, deliberately short outputs, or **`max_tokens: 0`** for cache pre-warming (see `shared/prompt-caching.md` → Pre-warming).
- **128K output tokens:** Fable 5, Opus 4.6, Opus 4.7, and Opus 4.8 support up to 128K `max_tokens`, but the SDKs require streaming for values that large to avoid HTTP timeouts. Use `.stream()` with `.get_final_message()` / `.finalMessage()`.
- **Tool call JSON parsing (Fable 5 and the 4.6/4.7/4.8 family):** Fable 5, Opus 4.6, Opus 4.7, Opus 4.8, and Sonnet 4.6 may produce different JSON string escaping in tool call `input` fields (e.g., Unicode or forward-slash escaping). Always parse tool inputs with `json.loads()` / `JSON.parse()` — never do raw string matching on the serialized input.
- **Structured outputs (all models):** Use `output_config: {format: {...}}` instead of the deprecated `output_format` parameter on `messages.create()`. This is a general API change, not 4.6-specific.
- **Don't reimplement SDK functionality:** The SDK provides high-level helpers — use them instead of building from scratch. Specifically: use `stream.finalMessage()` instead of wrapping `.on()` events in `new Promise()`; use typed exception classes (`Anthropic.RateLimitError`, etc.) instead of string-matching error messages; use SDK types (`Anthropic.MessageParam`, `Anthropic.Tool`, `Anthropic.Message`, etc.) instead of redefining equivalent interfaces.
- **Error handling — catch a chain, not one broad class.** A single `except APIStatusError` / `catch (AnthropicServiceException)` / `rescue APIError` loses the distinction between retryable (429, ≥500, network) and non-retryable (400/404) failures. Write a most-specific-first chain — e.g. `NotFoundError` → `RateLimitError` → `APIStatusError` → `APIConnectionError` (or the Go equivalent: `errors.As` into `*anthropic.Error` then `switch apierr.StatusCode { case 404: …; case 429: …; default: … }`). Per-language class names and namespaces are in `shared/error-codes.md`.
- **Don't research SDK types — write first.** If a type name isn't shown in the documentation included in this skill, write the code file from the namespace/package tables in the language-specific doc and let the compiler's error point you to the right name. Do not spend turns on WebFetch, SDK-repo clones, or compiling-and-running a separate reflection program to discover type names before writing — produce the source file first, then fix what the compiler reports. A quick `strings` / `jar tf` / `javap` against the installed SDK is acceptable for locating names (it returns in seconds), but don't escalate beyond that. A file with a wrong type name is recoverable; a session spent on discovery with no file written is not.
- **Bash and text editor tools are Anthropic-defined, schema-less.** Declare `{"type": "bash_20250124", "name": "bash"}` / `{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}` — no `input_schema`. A custom tool with your own schema named `"bash"` is a different tool. Handler paths and security checks are in `shared/tool-use-concepts.md` § Client-Side Tools.
- **Advisor tool model pairing.** The advisor tool's `model` must be at least as capable as the request's top-level `model` — e.g. executor `claude-sonnet-4-6` → advisor `claude-opus-4-8` or `claude-opus-4-7`. An invalid pair returns 400. Pairing table in `shared/tool-use-concepts.md` § Advisor. Availability: `shared/platform-availability.md`.
- **Agent Skills ≠ Managed Agents.** To have Claude generate a `.pptx`/`.xlsx`/etc. via Agent Skills, call `client.beta.messages.create` with `container={"skills": [...]}`, the `code_execution_20260521` tool, and both `code-execution-2025-08-25` + `skills-2025-10-02` betas. Do not use `client.beta.agents` / `sessions` / `environments` here — those are the Managed Agents surface, not Agent Skills.
- **MCP connector needs both halves.** `mcp_servers=[{type:"url", url, name}]` alone is rejected as a validation error — also add `tools=[{type:"mcp_toolset", mcp_server_name:<same name>}]` with beta `mcp-client-2025-11-20`. Availability: `shared/platform-availability.md`.
- **Context editing ≠ compaction.** Context editing *clears* tool results and thinking blocks; compaction *summarizes* history. For context editing, use `context_management.edits` with type `clear_tool_uses_20250919` (or `clear_thinking_20251015`) on `client.beta.messages.*` with beta `context-management-2025-06-27` — not the `compact_20260112` type or `compact-2026-01-12` beta, which are compaction.
- **`inference_geo` is a direct top-level request parameter** — `client.messages.create(..., inference_geo="us")` / `.inferenceGeo("us")`. Do not put it in `extra_body` / `putAdditionalBodyProperty`. Supported on Opus 4.6 / Sonnet 4.6 and later; availability: `shared/platform-availability.md`. `response.usage.inference_geo` reports where inference ran.
- **Fine-grained tool streaming is not a beta feature.** Set `eager_input_streaming: true` on the tool definition and call the regular `client.messages.stream(...)`. There is no beta header and no `client.beta.*` path.
- **Cache diagnostics is beta.** Use `client.beta.messages.*` with beta `cache-diagnosis-2026-04-07`. Pass `diagnostics: {previous_message_id: null}` on the first turn and `diagnostics: {previous_message_id: <previous response id>}` on subsequent turns; the result is on `response.diagnostics`. Availability: `shared/platform-availability.md`.
- **Memory tool type is `memory_20250818`.** Declare `{"type": "memory_20250818", "name": "memory"}`. Go uses the beta-namespace type `{OfMemoryTool20250818: &anthropic.BetaMemoryTool20250818Param{}}` on `client.Beta.Messages.New`; Python/TypeScript/Ruby/PHP/C# use the non-beta `client.messages.create`; Java has both a non-beta `MemoryTool20250818` and a beta tool-runner path. Python/TypeScript provide `BetaAbstractMemoryTool` / `betaMemoryTool` helpers for implementing the backend.
- **Use a model the feature actually supports.** Some features are restricted to specific model tiers — fast mode is Opus 4.8 / 4.7 only, task budgets are Fable 5 / Opus 4.8 / 4.7 only, and the advisor tool requires a valid executor↔advisor pair. If the user's prompt names a model that the feature doesn't support, use a supported model instead and note the substitution in the output.
- **Bedrock / Foundry: use the platform client class.** For Bedrock use the `…BedrockMantle…` client (e.g. Python `AnthropicBedrockMantle`, Java `BedrockMantleBackend`) with `anthropic.`-prefixed model IDs; `AnthropicBedrock`/`BedrockBackend` without `Mantle` is the legacy path. For Foundry use `AnthropicFoundry` / `FoundryBackend` / `AnthropicFoundryClient` where the SDK supports it (C#, Java, PHP, Python, TypeScript); Go and Ruby have no Foundry client — Ruby's documented fallback is the first-party client with a custom `base_url`. Per-language table above.
- **Don't define custom types for SDK data structures:** The SDK exports types for all API objects. Use `Anthropic.MessageParam` for messages, `Anthropic.Tool` for tool definitions, `Anthropic.ToolUseBlock` / `Anthropic.ToolResultBlockParam` for tool results, `Anthropic.Message` for responses. Defining your own `interface ChatMessage { role: string; content: unknown }` duplicates what the SDK already provides and loses type safety.
- **Report and document output:** For tasks that produce reports, documents, or visualizations, the code execution sandbox has `python-docx`, `python-pptx`, `matplotlib`, `pillow`, and `pypdf` pre-installed. Claude can generate formatted files (DOCX, PDF, charts) and return them via the Files API — consider this for "report" or "document" type requests instead of plain stdout text.
- **Server-tool errors don't raise.** Web search and web fetch errors return HTTP 200 with a `web_search_tool_result` / `web_fetch_tool_result` block whose `content` is a single error object (e.g. `{error_code: "max_uses_exceeded"}`) — not a raised exception. For web search, a success `content` is a *list*; an error `content` is an *object* — branch on that before indexing.
- **Code execution output block type:** `code_execution_20260521` returns `bash_code_execution_tool_result` (with `.content.stdout`), **not** the legacy bare `code_execution_tool_result`. Iterate `response.content` and match on the correct type.
- **Tool search: never defer everything.** The search tool itself must not have `defer_loading: true`, and at least one tool in `tools` must be non-deferred, or the API returns 400 `All tools have defer_loading set`.
- **`strict: true` goes on the tool, not `tool_choice`.** Putting `strict` on `tool_choice` does nothing; it's a sibling of `name`/`description`/`input_schema` on the tool definition itself.
- **Parallel tool results go in ONE user message.** Splitting `tool_result` blocks across multiple user messages silently trains Claude to stop making parallel calls. One assistant message of `tool_use` blocks → one user message of `tool_result` blocks.
- **Citations + structured outputs are incompatible.** Enabling `citations: {enabled: true}` on a document while also setting `output_config.format` returns a 400.
- **Batch results are unordered.** Match by `custom_id`, never by position in the results stream.
- **Vertex model IDs have no prefix.** Unlike Bedrock's `anthropic.`-prefixed IDs, Vertex takes the bare first-party ID for current-generation models (e.g. `"claude-opus-4-8"`); dated-snapshot models use an `@` separator (e.g. `claude-haiku-4-5@20251001`).
- **`stop_details` is `null` unless `stop_reason == "refusal"`.** For `max_tokens`, `end_turn`, etc., `stop_details` is `null` — guard before reading `.category`.
- **WIF auth: unset `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, and `ANTHROPIC_PROFILE`.** `ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` (even set to `""`) outrank Workload Identity Federation in the SDK's precedence chain and silently win; a set `ANTHROPIC_PROFILE` also wins (a missing named profile is an error, not a fall-through). `unset` them, don't blank them.
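
For the tool-call JSON parsing point in the list above, a minimal sketch shows why comparing parsed values is safer than matching on the serialized string. The payloads are made up for illustration, and `tool_input_matches` is a hypothetical helper.

```python
import json

# Two serializations of the same tool input: one escapes "/" and non-ASCII,
# the other does not. Both are valid JSON.
escaped = '{"path": "src\\/app.py", "note": "caf\\u00e9"}'
plain = '{"path": "src/app.py", "note": "café"}'

def tool_input_matches(serialized: str, expected: dict) -> bool:
    """Compare a tool input by parsed value, never by raw string."""
    return json.loads(serialized) == expected

expected = {"path": "src/app.py", "note": "café"}
same_after_parsing = (
    tool_input_matches(escaped, expected) and tool_input_matches(plain, expected)
)
raw_strings_differ = escaped != plain
```

A substring check such as `"src/app.py" in escaped` would fail here even though the parsed input is identical.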

View File

@@ -1,447 +0,0 @@
# Claude API — C#
> **Note:** The C# SDK is the official Anthropic SDK for C#. Tool use is supported via the Messages API with a beta `BetaToolRunner` for automatic tool execution loops. The SDK also supports Microsoft.Extensions.AI IChatClient integration with function invocation and Managed Agents (beta).
## Installation
```bash
dotnet add package Anthropic
```
## Client Initialization
```csharp
using Anthropic;
// Default (uses ANTHROPIC_API_KEY env var)
AnthropicClient client = new();
// Explicit API key (use environment variables — never hardcode keys)
AnthropicClient client = new() {
    ApiKey = Environment.GetEnvironmentVariable("ANTHROPIC_API_KEY")
};
```
---
## Basic Message Request
```csharp
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
    Model = Model.ClaudeOpus4_6,
    MaxTokens = 16000,
    Messages = [new() { Role = Role.User, Content = "What is the capital of France?" }]
};
var response = await client.Messages.Create(parameters);
// ContentBlock is a union wrapper. .Value unwraps to the variant object,
// then OfType<T> filters to the type you want. Or use the TryPick* idiom
// shown in the Thinking section below.
foreach (var text in response.Content.Select(b => b.Value).OfType<TextBlock>())
{
    Console.WriteLine(text.Text);
}
```
---
## Streaming
```csharp
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
    Model = Model.ClaudeOpus4_6,
    MaxTokens = 64000,
    Messages = [new() { Role = Role.User, Content = "Write a haiku" }]
};
await foreach (RawMessageStreamEvent streamEvent in client.Messages.CreateStreaming(parameters))
{
    if (streamEvent.TryPickContentBlockDelta(out var delta) &&
        delta.Delta.TryPickText(out var text))
    {
        Console.Write(text.Text);
    }
}
```
**`RawMessageStreamEvent` TryPick methods** (naming drops the `Message`/`Raw` prefix): `TryPickStart`, `TryPickDelta`, `TryPickStop`, `TryPickContentBlockStart`, `TryPickContentBlockDelta`, `TryPickContentBlockStop`. There is no `TryPickMessageStop` — use `TryPickStop`.
---
## Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think.
```csharp
using Anthropic.Models.Messages;
var response = await client.Messages.Create(new MessageCreateParams
{
    Model = Model.ClaudeOpus4_6,
    MaxTokens = 16000,
    // ThinkingConfigParam? implicitly converts from the concrete variant classes —
    // no wrapper needed.
    Thinking = new ThinkingConfigAdaptive(),
    Messages =
    [
        new() { Role = Role.User, Content = "Solve: 27 * 453" },
    ],
});
// ThinkingBlock(s) precede TextBlock in Content. TryPick* narrows the union.
foreach (var block in response.Content)
{
    if (block.TryPickThinking(out ThinkingBlock? t))
    {
        Console.WriteLine($"[thinking] {t.Thinking}");
    }
    else if (block.TryPickText(out TextBlock? text))
    {
        Console.WriteLine(text.Text);
    }
}
```
> **Deprecated:** `new ThinkingConfigEnabled { BudgetTokens = N }` (fixed-budget extended thinking) still works on Claude 4.6 but is deprecated. Use adaptive thinking above.
Alternative to `TryPick*`: `.Select(b => b.Value).OfType<ThinkingBlock>()` (same LINQ pattern as the Basic Message example).
---
## Tool Use
### Defining a tool
`Tool` (NOT `ToolParam`) with an `InputSchema` record. `InputSchema.Type` is auto-set to `"object"` by the constructor — don't set it. `ToolUnion` has an implicit conversion from `Tool`, triggered by the collection expression `[...]`.
```csharp
using System.Text.Json;
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
    Model = Model.ClaudeSonnet4_6,
    MaxTokens = 16000,
    Tools = [
        new Tool {
            Name = "get_weather",
            Description = "Get the current weather in a given location",
            InputSchema = new() {
                Properties = new Dictionary<string, JsonElement> {
                    ["location"] = JsonSerializer.SerializeToElement(
                        new { type = "string", description = "City name" }),
                },
                Required = ["location"],
            },
        },
    ],
    Messages = [new() { Role = Role.User, Content = "Weather in Paris?" }],
};
```
Derived from `anthropic-sdk-csharp/src/Anthropic/Models/Messages/Tool.cs` and `ToolUnion.cs:799` (implicit conversion).
See [shared tool use concepts](../shared/tool-use-concepts.md) for the loop pattern.
### Converting response content to the follow-up assistant message
When echoing Claude's response back in the assistant turn, **there is no `.ToParam()` helper** — manually reconstruct each `ContentBlock` variant as its `*Param` counterpart. Do NOT use `new ContentBlockParam(block.Json)`: it compiles and serializes, but `.Value` stays `null` so `TryPick*`/`Validate()` fail (degraded JSON pass-through, not the typed path).
```csharp
using Anthropic.Models.Messages;
Message response = await client.Messages.Create(parameters);
// No .ToParam() — reconstruct per variant. Implicit conversions from each
// *Param type to ContentBlockParam mean no explicit wrapper.
List<ContentBlockParam> assistantContent = [];
List<ContentBlockParam> toolResults = [];
foreach (ContentBlock block in response.Content)
{
    if (block.TryPickText(out TextBlock? text))
    {
        assistantContent.Add(new TextBlockParam { Text = text.Text });
    }
    else if (block.TryPickThinking(out ThinkingBlock? thinking))
    {
        // Signature MUST be preserved — the API rejects tampering
        assistantContent.Add(new ThinkingBlockParam
        {
            Thinking = thinking.Thinking,
            Signature = thinking.Signature,
        });
    }
    else if (block.TryPickRedactedThinking(out RedactedThinkingBlock? redacted))
    {
        assistantContent.Add(new RedactedThinkingBlockParam { Data = redacted.Data });
    }
    else if (block.TryPickToolUse(out ToolUseBlock? toolUse))
    {
        // ToolUseBlock has required Caller; ToolUseBlockParam.Caller is optional — don't copy it
        assistantContent.Add(new ToolUseBlockParam
        {
            ID = toolUse.ID,
            Name = toolUse.Name,
            Input = toolUse.Input,
        });
        // Execute the tool; collect ONE result per tool_use block — the API
        // rejects the follow-up if any tool_use ID lacks a matching tool_result.
        string result = ExecuteYourTool(toolUse.Name, toolUse.Input);
        toolResults.Add(new ToolResultBlockParam
        {
            ToolUseID = toolUse.ID,
            Content = result,
        });
    }
}
// Follow-up: prior messages + assistant echo + user tool_result(s)
List<MessageParam> followUpMessages =
[
    .. parameters.Messages,
    new() { Role = Role.Assistant, Content = assistantContent },
    new() { Role = Role.User, Content = toolResults },
];
```
`ToolResultBlockParam` has no tuple constructor — use the object initializer. `Content` is a string-or-list union; a plain `string` implicitly converts.
---
## Context Editing / Compaction (Beta)
**Beta-namespace prefix is inconsistent** (source-verified against `src/Anthropic/Models/Beta/Messages/*.cs` @ 12.9.0). No prefix: `MessageCreateParams`, `MessageCountTokensParams`, `Role`. **Everything else has the `Beta` prefix**: `BetaMessageParam`, `BetaMessage`, `BetaContentBlock`, `BetaToolUseBlock`, all block param types. The unprefixed `Role` WILL collide with `Anthropic.Models.Messages.Role` if you import both namespaces (CS0104). Safest: import only Beta; if mixing, alias the beta `Role`:
```csharp
using Anthropic.Models.Beta.Messages;
using NonBeta = Anthropic.Models.Messages; // only if you also need non-beta types
// Now: MessageCreateParams, BetaMessageParam, Role (beta's), NonBeta.Role (if needed)
```
`BetaMessage.Content` is `IReadOnlyList<BetaContentBlock>` — a 15-variant discriminated union. Narrow with `TryPick*`. **Response `BetaContentBlock` is NOT assignable to param `BetaContentBlockParam`** — there's no `.ToParam()` in C#. Round-trip by converting each block:
```csharp
using Anthropic.Models.Beta.Messages;
var betaParams = new MessageCreateParams // no Beta prefix: one of the few unprefixed types
{
    Model = Model.ClaudeOpus4_6,
    MaxTokens = 16000,
    Betas = ["compact-2026-01-12"],
    ContextManagement = new BetaContextManagementConfig
    {
        Edits = [new BetaCompact20260112Edit()],
    },
    Messages = messages,
};
BetaMessage resp = await client.Beta.Messages.Create(betaParams);
foreach (BetaContentBlock block in resp.Content)
{
    if (block.TryPickCompaction(out BetaCompactionBlock? compaction))
    {
        // Content is nullable — compaction can fail server-side
        Console.WriteLine($"compaction summary: {compaction.Content}");
    }
}
// Context-edit metadata lives on a separate nullable field
if (resp.ContextManagement is { } ctx)
{
    foreach (var edit in ctx.AppliedEdits)
        Console.WriteLine($"cleared {edit.ClearedInputTokens} tokens");
}
// ROUND-TRIP: BetaMessageParam.Content is BetaMessageParamContent (a string|list
// union). It implicit-converts from List<BetaContentBlockParam>, NOT from the
// response's IReadOnlyList<BetaContentBlock>. Convert each block:
List<BetaContentBlockParam> paramBlocks = [];
foreach (var b in resp.Content)
{
    if (b.TryPickText(out var t)) paramBlocks.Add(new BetaTextBlockParam { Text = t.Text });
    else if (b.TryPickCompaction(out var c)) paramBlocks.Add(new BetaCompactionBlockParam { Content = c.Content });
    // ... other variants as needed
}
messages.Add(new BetaMessageParam { Role = Role.Assistant, Content = paramBlocks });
```
All 15 `BetaContentBlock.TryPick*` variants: `Text`, `Thinking`, `RedactedThinking`, `ToolUse`, `ServerToolUse`, `WebSearchToolResult`, `WebFetchToolResult`, `CodeExecutionToolResult`, `BashCodeExecutionToolResult`, `TextEditorCodeExecutionToolResult`, `ToolSearchToolResult`, `McpToolUse`, `McpToolResult`, `ContainerUpload`, `Compaction`.
**`BetaToolUseBlock.Input` is `IReadOnlyDictionary<string, JsonElement>`** — index by key then call the `JsonElement` extractor:
```csharp
if (block.TryPickToolUse(out BetaToolUseBlock? tu))
{
    int a = tu.Input["a"].GetInt32();
    string s = tu.Input["name"].GetString()!;
}
```
---
## Effort Parameter
Effort is nested under `OutputConfig`, NOT a top-level property. `ApiEnum<string, Effort>` has an implicit conversion from the enum, so assign `Effort.High` directly.
```csharp
OutputConfig = new OutputConfig { Effort = Effort.High },
```
Values: `Effort.Low`, `Effort.Medium`, `Effort.High`, `Effort.Max`. Combine with `Thinking = new ThinkingConfigAdaptive()` for cost-quality control.
---
## Prompt Caching
`System` takes `MessageCreateParamsSystem?` — a union of `string` or `List<TextBlockParam>`. There is no `SystemTextBlockParam`; use plain `TextBlockParam`. The implicit conversion needs the concrete `List<TextBlockParam>` type (array literals won't convert). For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```csharp
System = new List<TextBlockParam> {
new() {
Text = longSystemPrompt,
CacheControl = new CacheControlEphemeral(), // auto-sets Type = "ephemeral"
},
},
```
Optional `Ttl` on `CacheControlEphemeral`: `new() { Ttl = Ttl.Ttl1h }` or `Ttl.Ttl5m`. `CacheControl` also exists on `Tool.CacheControl` and top-level `MessageCreateParams.CacheControl`.
Verify hits via `response.Usage.CacheCreationInputTokens` / `response.Usage.CacheReadInputTokens`.
---
## Token Counting
```csharp
MessageTokensCount result = await client.Messages.CountTokens(new MessageCountTokensParams {
Model = Model.ClaudeOpus4_6,
Messages = [new() { Role = Role.User, Content = "Hello" }],
});
long tokens = result.InputTokens;
```
`MessageCountTokensParams.Tools` uses a different union type (`MessageCountTokensTool`) than `MessageCreateParams.Tools` (`ToolUnion`) — if you're passing tools, the compiler will tell you when it matters.
---
## Structured Output
```csharp
OutputConfig = new OutputConfig {
Format = new JsonOutputFormat {
Schema = new Dictionary<string, JsonElement> {
["type"] = JsonSerializer.SerializeToElement("object"),
["properties"] = JsonSerializer.SerializeToElement(
new { name = new { type = "string" } }),
["required"] = JsonSerializer.SerializeToElement(new[] { "name" }),
},
},
},
```
`JsonOutputFormat.Type` is auto-set to `"json_schema"` by the constructor. `Schema` is `required`.
---
## PDF / Document Input
`DocumentBlockParam` takes a `DocumentBlockParamSource` union: `Base64PdfSource` / `UrlPdfSource` / `PlainTextSource` / `ContentBlockSource`. `Base64PdfSource` auto-sets `MediaType = "application/pdf"` and `Type = "base64"`.
```csharp
new MessageParam {
Role = Role.User,
Content = new List<ContentBlockParam> {
new DocumentBlockParam { Source = new Base64PdfSource { Data = base64String } },
new TextBlockParam { Text = "Summarize this PDF" },
},
}
```
---
## Server-Side Tools
Web search, bash, text editor, and code execution are built-in server tools. Type names are version-suffixed; constructors auto-set `name`/`type`. All implicit-convert to `ToolUnion`.
```csharp
Tools = [
new WebSearchTool20260209(),
new ToolBash20250124(),
new ToolTextEditor20250728(),
new CodeExecutionTool20260120(),
],
```
Also available: `WebFetchTool20260209`, `MemoryTool20250818`. `WebSearchTool20260209` optionals: `AllowedDomains`, `BlockedDomains`, `MaxUses`, `UserLocation`.
---
## Files API (Beta)
Files live under `client.Beta.Files` (namespace `Anthropic.Models.Beta.Files`). `BinaryContent` implicit-converts from `Stream` and `byte[]`.
```csharp
using Anthropic.Models.Beta.Files;
using Anthropic.Models.Beta.Messages;
FileMetadata meta = await client.Beta.Files.Upload(
new FileUploadParams { File = File.OpenRead("doc.pdf") });
// Referencing the uploaded file requires Beta message types:
new BetaRequestDocumentBlock {
Source = new BetaFileDocumentSource { FileID = meta.ID },
}
```
The non-beta `DocumentBlockParamSource` union has no file-ID variant — file references need `client.Beta.Messages.Create()`.
---
## Tool Runner (Beta)
The C# SDK provides a `BetaToolRunner` for automatic tool execution loops. Define tools with raw JSON schemas, and the runner handles the API call → tool execution → result feedback loop.
```csharp
using Anthropic.Models.Beta.Messages;
// Define tools and create params as shown in the Tool Use section above,
// but using the beta namespace types (BetaToolUnion, etc.)
var runner = client.Beta.Messages.ToolRunner(betaParams);
await foreach (BetaMessage message in runner)
{
foreach (var block in message.Content)
{
if (block.TryPickText(out var text))
{
Console.WriteLine(text.Text);
}
}
}
```
---
## Stop Details
When `StopReason` is `"refusal"`, the response includes structured `StopDetails`:
```csharp
if (response.StopReason == "refusal" && response.StopDetails is { } details)
{
Console.WriteLine($"Category: {details.Category}");
Console.WriteLine($"Explanation: {details.Explanation}");
}
```
---
## Managed Agents (Beta)
The C# SDK supports Managed Agents via `client.Beta.Agents`, `client.Beta.Sessions`, `client.Beta.Environments`, and related namespaces. See `shared/managed-agents-overview.md` for the architecture and `curl/managed-agents.md` for the wire-level reference.

@@ -0,0 +1,361 @@
# Claude API — C#
> **Note:** The C# SDK is the official Anthropic SDK for C#. Tool use is supported via the Messages API with a beta `BetaToolRunner` for automatic tool execution loops. The SDK also supports Microsoft.Extensions.AI IChatClient integration with function invocation and Managed Agents (beta).
## Namespace Reference
Types are organized by namespace. If a type you need isn't shown in an example below, locate it via this table first — don't block on fetching SDK source over the network.
| `using` | Contains |
|---|---|
| `Anthropic` | `AnthropicClient`, top-level options |
| `Anthropic.Models.Messages` | non-beta request/response types — `MessageCreateParams`, `Model`, `Role`, `ContentBlock`, `TextBlock`, `ToolUseBlock`, `ToolResultBlockParam`, `Tool*` (tool definition classes) |
| `Anthropic.Models.Beta.Messages` | beta-endpoint equivalents — `MessageCreateParams`, `BetaMessage`, `BetaTool*`, `Speed`, `BetaRequestMcpServerUrlDefinition`, context-editing/compaction configs |
| `Anthropic.Models.Beta` | shared beta constants |
| `Anthropic.Models.Beta.Files` | Files API types |
| `Anthropic.Models.Messages.Batches` | Batch API types |
| `Anthropic.Helpers.Beta` | `BetaToolRunner`, beta helper utilities |
| `Anthropic.Exceptions` | `AnthropicApiException`, `AnthropicRateLimitException`, `Anthropic5xxException`, etc. — see `shared/error-codes.md` |
| `Anthropic.Bedrock` / `Anthropic.Vertex` / `Anthropic.Foundry` / `Anthropic.Aws` | platform clients (separate NuGet packages): `AnthropicBedrockMantleClient`, `AnthropicFoundryClient`, `AnthropicAwsClient` |
`client.Messages.*` uses non-beta types; `client.Beta.Messages.*` uses the `Anthropic.Models.Beta.Messages` types. Both namespaces define a `MessageCreateParams` — pick the one matching the client path you call.
### Key types per feature
Write from this table instead of reflecting the SDK assembly. Endpoint column tells you whether to use `client.Messages.*` or `client.Beta.Messages.*`.
| Feature | Endpoint | Key C# types (namespace per table above) |
|---|---|---|
| User profiles | beta | `client.Beta.UserProfiles.Create(...)` / `.Retrieve(id)` / `.List()`. Pass the returned profile id on the beta messages call. Requires a beta header — check the SDK's beta-headers reference for the current flag. |
| Agent Skills | beta | `BetaContainerParams` (with `Skills = [new BetaSkillParams { ... }]`), `BetaCodeExecutionTool20250825`. `Betas = ["code-execution-2025-08-25", "skills-2025-10-02"]`. Download the output via `client.Beta.Files.Download(fileId)`. |
| Advisor tool | beta | `BetaAdvisorTool20260301` — may not be in all SDK releases yet |
| Cache diagnostics | beta | `Diagnostics = new() { PreviousMessageID = … }`, `BetaCacheControlEphemeral`, `BetaContentBlockParam` |
| Context editing | beta | `ContextManagement = new BetaContextManagementConfig { Edits = [new BetaClearToolUses20250919Edit()] }`. `Betas = ["context-management-2025-06-27"]` (not `compact-2026-01-12` — that's for `BetaCompact20260112Edit`). |
| Memory tool | non-beta | `Tools = [new ToolUnion(new MemoryTool20250818())]` |
| Programmatic tool calling | non-beta | `CodeExecutionTool20260120`, `ToolResultBlockParam`, `ContentBlockParam` |
| Task budgets | beta | `BetaOutputConfig` with `TaskBudget = new BetaTokenTaskBudget { ... }` |
| Tool search | non-beta | `new ToolUnion(new ToolSearchToolRegex20251119 { Type = ToolSearchToolRegex20251119Type.ToolSearchToolRegex20251119 })`. `Type` must be set explicitly. |
| Web search | non-beta | `new ToolUnion(new WebSearchTool20260209())` — the latest variant with dynamic filtering (Opus 4.8/4.7/4.6 + Sonnet 4.6). For older models or Vertex, use `WebSearchTool20250305()` |
### Discovering type and member names
If a type or member you need isn't in the tables above, `strings ~/.nuget/packages/anthropic/*/lib/*/Anthropic.dll | grep -i <term>` is fast and sufficient for locating class and property names. **Do not escalate to a `dotnet run` reflection probe** to dump members precisely — the first compile is slow enough to be backgrounded in many environments, trapping you in a polling loop. Instead, write `Program.cs` using the names `strings | grep` found; if a member name is wrong the compiler error (`error CS1061: 'X' does not contain a definition for 'Y'`) points at it in a few seconds, faster than any reflection probe.
Note that `strings` will not surface wire-format snake_case field names (`output_tokens`, `stop_reason`) — those are stored in the DLL differently. **C# properties are the PascalCase equivalent of the wire field** (`response.Usage.OutputTokens`, `response.StopReason`). If you know the wire field name from the docs, write the PascalCase property and compile; do not probe for the snake_case string.
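The wire-name-to-property mapping described above can be sketched as a small helper. This is a heuristic illustration, not part of the SDK: `ToPropertyName` is a hypothetical name, and the `id` to `ID` special case is only inferred from property names that appear in this file (`ToolUseID`, `CustomID`, `FileID`). Other acronyms are not handled, so treat the result as a first guess for the compiler to confirm.

```csharp
using System;
using System.Linq;

// Hypothetical helper: converts a wire-format snake_case field name to the
// likely PascalCase C# property name. The "id" -> "ID" rule mirrors names
// seen in this file (ToolUseID, CustomID, FileID); it is an assumption.
static string ToPropertyName(string wireName) =>
    string.Concat(
        wireName.Split('_', StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part == "id"
                ? "ID"
                : char.ToUpperInvariant(part[0]) + part.Substring(1)));

Console.WriteLine(ToPropertyName("output_tokens"));           // OutputTokens
Console.WriteLine(ToPropertyName("stop_reason"));             // StopReason
Console.WriteLine(ToPropertyName("cache_read_input_tokens")); // CacheReadInputTokens
Console.WriteLine(ToPropertyName("tool_use_id"));             // ToolUseID
```

If the guessed property does not exist, the `CS1061` error names the type to inspect, which is usually enough to correct it.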
### Minimal working skeleton
**Write a plain `Program.cs` body:** `using` statements followed by top-level statements, as below. Do **not** add a `#!/usr/bin/env dotnet` shebang or `#:package Anthropic@*` directive: those are .NET file-based-app syntax and fail with `CS1024: Preprocessor directive expected` when the file is compiled via an existing `.csproj`. The standard project setup (per the [C# quickstart](https://docs.claude.com/en/docs/get-started): `dotnet new console` → `dotnet add package Anthropic` → edit `Program.cs` → `dotnet run`) provides the `.csproj` and package reference.
Start from this — it compiles as-is. Fill in the feature-specific fields; do not spend turns running reflection or XML-doc inspection to discover type names first.
```csharp
using System;
using Anthropic;
using Anthropic.Models.Messages; // or Anthropic.Models.Beta.Messages for beta endpoints
AnthropicClient client = new();
var message = await client.Messages.Create(new MessageCreateParams
{
Model = Model.ClaudeOpus4_8,
MaxTokens = 1024,
Messages = [ new() { Role = Role.User, Content = "Hello, Claude" } ],
});
Console.WriteLine(message);
```
For beta features (anything behind an `anthropic-beta` header), use the beta client path and namespace — same overall shape:
```csharp
using System;
using Anthropic;
using Anthropic.Models.Beta.Messages;
AnthropicClient client = new();
var response = await client.Beta.Messages.Create(new MessageCreateParams
{
Model = "claude-opus-4-8",
MaxTokens = 4096,
Betas = ["<beta-flag>"],
Messages = [ new() { Role = Role.User, Content = "…" } ],
// Tools = new BetaToolUnion[] { new BetaSomeTool { … } }, // for tool features
});
Console.WriteLine(response);
```
If a type name the feature needs isn't in this file, write it following the naming pattern in the Namespace Reference above and fix from compiler output — producing a `Program.cs` and iterating beats researching.
### Common C# compile errors
- **CS8803 (top-level statements must precede type declarations):** put any `record`/`class`/`struct` definitions **after** the last top-level statement, at the end of the file. A record defined above `var client = new AnthropicClient()` will not compile.
- **`await foreach` on a `Task<…Page>`:** `client.Models.List()` returns a `Task<ModelListPage>`, which is not directly async-enumerable. Await it first, then iterate: `var page = await client.Models.List(); foreach (var m in page.Items) {…}`. For auto-pagination, check whether the page type exposes `AutoPagingEachAsync()` or similar before reaching for `await foreach`.
## Installation
```bash
dotnet add package Anthropic
```
## Client Initialization
```csharp
using Anthropic;
// Default (uses ANTHROPIC_API_KEY env var)
AnthropicClient client = new();
// Alternative: explicit API key (read it from the environment; never hardcode keys).
// Use one initializer or the other; declaring `client` twice will not compile.
AnthropicClient client = new() {
ApiKey = Environment.GetEnvironmentVariable("ANTHROPIC_API_KEY")
};
```
---
## Basic Message Request
```csharp
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
Model = Model.ClaudeOpus4_8,
MaxTokens = 16000,
Messages = [new() { Role = Role.User, Content = "What is the capital of France?" }]
};
var response = await client.Messages.Create(parameters);
// ContentBlock is a union wrapper. .Value unwraps to the variant object,
// then OfType<T> filters to the type you want. Or use the TryPick* idiom
// shown in the Thinking section below.
foreach (var text in response.Content.Select(b => b.Value).OfType<TextBlock>())
{
Console.WriteLine(text.Text);
}
```
---
## Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think.
> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking (below). `new ThinkingConfigEnabled { BudgetTokens = N }` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `new ThinkingConfigEnabled { BudgetTokens = N }` (budget must be < `MaxTokens`, min 1024).
```csharp
using Anthropic.Models.Messages;
var response = await client.Messages.Create(new MessageCreateParams
{
Model = Model.ClaudeOpus4_8,
MaxTokens = 16000,
// ThinkingConfigParam? implicitly converts from the concrete variant classes —
// no wrapper needed.
// display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
Thinking = new ThinkingConfigAdaptive { Display = Display.Summarized },
Messages =
[
new() { Role = Role.User, Content = "Solve: 27 * 453" },
],
});
// ThinkingBlock(s) precede TextBlock in Content. TryPick* narrows the union.
foreach (var block in response.Content)
{
if (block.TryPickThinking(out ThinkingBlock? t))
{
Console.WriteLine($"[thinking] {t.Thinking}");
}
else if (block.TryPickText(out TextBlock? text))
{
Console.WriteLine(text.Text);
}
}
```
Alternative to `TryPick*`: `.Select(b => b.Value).OfType<ThinkingBlock>()` (same LINQ pattern as the Basic Message example).
---
## Context Editing / Compaction (Beta)
**Beta-namespace prefix is inconsistent** (source-verified against `src/Anthropic/Models/Beta/Messages/*.cs` @ 12.9.0). No prefix: `MessageCreateParams`, `MessageCountTokensParams`, `Role`, `Speed`. **Everything else has the `Beta` prefix**: `BetaMessageParam`, `BetaMessage`, `BetaContentBlock`, `BetaToolUseBlock`, all block param types. The unprefixed `Role` WILL collide with `Anthropic.Models.Messages.Role` if you import both namespaces (CS0104). Safest: import only Beta; if mixing, alias the beta `Role`:
```csharp
using Anthropic.Models.Beta.Messages;
using NonBeta = Anthropic.Models.Messages; // only if you also need non-beta types
// Now: MessageCreateParams, BetaMessageParam, Role (beta's), NonBeta.Role (if needed)
```
`BetaMessage.Content` is `IReadOnlyList<BetaContentBlock>` — a 15-variant discriminated union. Narrow with `TryPick*`. **Response `BetaContentBlock` is NOT assignable to param `BetaContentBlockParam`** — there's no `.ToParam()` in C#. Round-trip by converting each block:
```csharp
using Anthropic.Models.Beta.Messages;
var betaParams = new MessageCreateParams // no Beta prefix — see unprefixed list above
{
Model = Model.ClaudeOpus4_8,
MaxTokens = 16000,
Betas = ["compact-2026-01-12"],
ContextManagement = new BetaContextManagementConfig
{
Edits = [new BetaCompact20260112Edit()],
},
Messages = messages,
};
BetaMessage resp = await client.Beta.Messages.Create(betaParams);
foreach (BetaContentBlock block in resp.Content)
{
if (block.TryPickCompaction(out BetaCompactionBlock? compaction))
{
// Content is nullable — compaction can fail server-side
Console.WriteLine($"compaction summary: {compaction.Content}");
}
}
// Context-edit metadata lives on a separate nullable field
if (resp.ContextManagement is { } ctx)
{
foreach (var edit in ctx.AppliedEdits)
Console.WriteLine($"cleared {edit.ClearedInputTokens} tokens");
}
// ROUND-TRIP: BetaMessageParam.Content is BetaMessageParamContent (a string|list
// union). It implicit-converts from List<BetaContentBlockParam>, NOT from the
// response's IReadOnlyList<BetaContentBlock>. Convert each block:
List<BetaContentBlockParam> paramBlocks = [];
foreach (var b in resp.Content)
{
if (b.TryPickText(out var t)) paramBlocks.Add(new BetaTextBlockParam { Text = t.Text });
else if (b.TryPickCompaction(out var c)) paramBlocks.Add(new BetaCompactionBlockParam { Content = c.Content });
// ... other variants as needed
}
messages.Add(new BetaMessageParam { Role = Role.Assistant, Content = paramBlocks });
```
All 15 `BetaContentBlock.TryPick*` variants: `Text`, `Thinking`, `RedactedThinking`, `ToolUse`, `ServerToolUse`, `WebSearchToolResult`, `WebFetchToolResult`, `CodeExecutionToolResult`, `BashCodeExecutionToolResult`, `TextEditorCodeExecutionToolResult`, `ToolSearchToolResult`, `McpToolUse`, `McpToolResult`, `ContainerUpload`, `Compaction`.
**`BetaToolUseBlock.Input` is `IReadOnlyDictionary<string, JsonElement>`** — index by key then call the `JsonElement` extractor:
```csharp
if (block.TryPickToolUse(out BetaToolUseBlock? tu))
{
int a = tu.Input["a"].GetInt32();
string s = tu.Input["name"].GetString()!;
}
```
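Indexing the dictionary directly throws `KeyNotFoundException` when the model omits an argument, and `GetInt32()` throws `InvalidOperationException` when the value is not a number. A defensive variant, sketched here with only `System.Text.Json`, might look like the following. `ReadInt` and `ReadString` are hypothetical helper names, and the `input` dictionary is a stand-in for `tu.Input` so the sketch runs without the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Stand-in for tu.Input: a JSON object parsed into the same dictionary shape.
IReadOnlyDictionary<string, JsonElement> input =
    JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(
        "{\"a\": 7, \"name\": \"Paris\"}")!;

// Hypothetical helper: returns the fallback when the key is missing
// or the value is not a 32-bit integer.
static int ReadInt(IReadOnlyDictionary<string, JsonElement> args, string key, int fallback) =>
    args.TryGetValue(key, out JsonElement el)
        && el.ValueKind == JsonValueKind.Number
        && el.TryGetInt32(out int value)
            ? value
            : fallback;

// Hypothetical helper: returns null when the key is missing
// or the value is not a JSON string.
static string? ReadString(IReadOnlyDictionary<string, JsonElement> args, string key) =>
    args.TryGetValue(key, out JsonElement el) && el.ValueKind == JsonValueKind.String
        ? el.GetString()
        : null;

Console.WriteLine(ReadInt(input, "a", -1));               // 7
Console.WriteLine(ReadInt(input, "missing", -1));         // -1
Console.WriteLine(ReadInt(input, "name", -1));            // -1 (wrong kind)
Console.WriteLine(ReadString(input, "name"));             // Paris
Console.WriteLine(ReadString(input, "missing") is null);  // True
```

Whether to fall back silently or return an error `tool_result` to the model is a design choice; the fallback here only keeps the sketch short.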
---
## Effort Parameter
Effort is nested under `OutputConfig`, NOT a top-level property. `ApiEnum<string, Effort>` has an implicit conversion from the enum, so assign `Effort.High` directly.
```csharp
OutputConfig = new OutputConfig { Effort = Effort.High },
```
Values: `Effort.Low`, `Effort.Medium`, `Effort.High`, `Effort.Max`. Combine with `Thinking = new ThinkingConfigAdaptive()` for cost-quality control.
---
## Prompt Caching
`System` takes `MessageCreateParamsSystem?` — a union of `string` or `List<TextBlockParam>`. There is no `SystemTextBlockParam`; use plain `TextBlockParam`. The implicit conversion needs the concrete `List<TextBlockParam>` type (array literals won't convert). For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```csharp
System = new List<TextBlockParam> {
new() {
Text = longSystemPrompt,
CacheControl = new CacheControlEphemeral(), // auto-sets Type = "ephemeral"
},
},
```
Optional `Ttl` on `CacheControlEphemeral`: `new() { Ttl = Ttl.Ttl1h }` or `Ttl.Ttl5m`. `CacheControl` also exists on `Tool.CacheControl` and top-level `MessageCreateParams.CacheControl`.
Verify hits via `response.Usage.CacheCreationInputTokens` / `response.Usage.CacheReadInputTokens`.
---
## Token Counting
```csharp
MessageTokensCount result = await client.Messages.CountTokens(new MessageCountTokensParams {
Model = Model.ClaudeOpus4_8,
Messages = [new() { Role = Role.User, Content = "Hello" }],
});
long tokens = result.InputTokens;
```
`MessageCountTokensParams.Tools` uses a different union type (`MessageCountTokensTool`) than `MessageCreateParams.Tools` (`ToolUnion`) — if you're passing tools, the compiler will tell you when it matters.
---
## PDF / Document Input
`DocumentBlockParam` takes a `DocumentBlockParamSource` union: `Base64PdfSource` / `UrlPdfSource` / `PlainTextSource` / `ContentBlockSource`. `Base64PdfSource` auto-sets `MediaType = "application/pdf"` and `Type = "base64"`.
```csharp
new MessageParam {
Role = Role.User,
Content = new List<ContentBlockParam> {
new DocumentBlockParam { Source = new Base64PdfSource { Data = base64String } },
new TextBlockParam { Text = "Summarize this PDF" },
},
}
```
---
## Fast Mode (Beta)
```csharp
var response = await client.Beta.Messages.Create(new MessageCreateParams {
Model = "claude-opus-4-8", MaxTokens = 4096,
Speed = Speed.Fast,
Betas = ["fast-mode-2026-02-01"],
Messages = [...],
});
```
## Models API
```csharp
var page = await client.Models.List();
foreach (var m in page.Items) { Console.WriteLine(m.ID); }
var one = await client.Models.Retrieve("claude-opus-4-8");
```
## Long Output (128k) + Prefill
Set `MaxTokens = 128000` on `client.Messages` and use the streaming path (see `streaming.md`). On Claude 4+ models, 128k output is native — no `output-128k-*` beta header or beta namespace is needed.
**Prefilling the assistant message** (putting a trailing `Role.Assistant` message in the input) is **not supported** on Claude Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6 — requests return a 400. Use `OutputConfig.Format` (structured outputs) instead.
## Stop Details
When `StopReason` is `"refusal"`, the response includes structured `StopDetails`:
```csharp
if (response.StopReason == "refusal" && response.StopDetails is { } details)
{
Console.WriteLine($"Category: {details.Category}");
Console.WriteLine($"Explanation: {details.Explanation}");
}
```
**Refusal fallbacks (Claude Fable 5) — opt in by default.** Fallbacks are opt-in: without them a refused request simply stops. In `claude-fable-5` code, include `Fallbacks = [new(Model.ClaudeOpus4_8)]` plus the `AnthropicBeta.ServerSideFallback2026_06_01` beta on the beta messages call by default — a policy decline is re-served by the fallback model inside the same call. Not available on Amazon Bedrock, Vertex AI, or Microsoft Foundry — use the client-side handler there: `new AnthropicClient { Handlers = [new BetaRefusalFallbackHandler { Fallbacks = [new(Model.ClaudeOpus4_8)] }] }` (namespace `Anthropic.Helpers`), with per-conversation state via `BetaFallbackState.Create()` scoped with `using (fallbackState.Use()) { ... }`. Full semantics (billing, sticky routing, streaming) and a runnable example: `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason, and the C# SDK repo's `examples/` (WebFetch via `shared/live-sources.md`).
---
## Managed Agents (Beta)
The C# SDK supports Managed Agents via `client.Beta.Agents`, `client.Beta.Sessions`, `client.Beta.Environments`, and related namespaces. See `shared/managed-agents-overview.md` for the architecture and `curl/managed-agents.md` for the wire-level reference.

@@ -0,0 +1,14 @@
# Message Batches — C#
## Message Batches API
```csharp
var batch = await client.Messages.Batches.Create(new() {
Requests = [
new() { CustomID = "req-1", Params = new() { Model = "claude-opus-4-8", MaxTokens = 1024, Messages = [...] } },
],
});
// Poll client.Messages.Batches.Retrieve(batch.ID) until ProcessingStatus == "ended",
// then iterate client.Messages.Batches.Results(batch.ID).
```
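The polling step in the comment above can be sketched as a generic loop. This is an illustration under stated assumptions rather than an SDK helper: `PollUntilAsync` is a hypothetical name, `fetch` stands in for a call such as `client.Messages.Batches.Retrieve(batch.ID)`, and the status strings in the simulated queue are placeholders so the code runs without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: calls fetch repeatedly until isDone returns true,
// waiting `interval` between attempts. Cancellation aborts the wait.
static async Task<T> PollUntilAsync<T>(
    Func<Task<T>> fetch,
    Func<T, bool> isDone,
    TimeSpan interval,
    CancellationToken cancellationToken = default)
{
    while (true)
    {
        T current = await fetch();
        if (isDone(current)) return current;
        await Task.Delay(interval, cancellationToken);
    }
}

// Simulated status sequence (placeholder values, not fetched from the API).
var statuses = new Queue<string>(new[] { "in_progress", "in_progress", "ended" });
int calls = 0;

string final = await PollUntilAsync(
    fetch: () => { calls++; return Task.FromResult(statuses.Dequeue()); },
    isDone: status => status == "ended",
    interval: TimeSpan.FromMilliseconds(10));

Console.WriteLine($"{final} after {calls} polls"); // ended after 3 polls
```

For real batches a much longer interval is appropriate, and passing a `CancellationToken` with a timeout avoids polling indefinitely.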

@@ -0,0 +1,23 @@
# Files API — C#
## Files API (Beta)
Files live under `client.Beta.Files` (namespace `Anthropic.Models.Beta.Files`). `BinaryContent` implicit-converts from `Stream` and `byte[]`.
```csharp
using Anthropic.Models.Beta.Files;
using Anthropic.Models.Beta.Messages;
FileMetadata meta = await client.Beta.Files.Upload(
new FileUploadParams { File = File.OpenRead("doc.pdf") });
// Referencing the uploaded file requires Beta message types:
new BetaRequestDocumentBlock {
Source = new BetaFileDocumentSource { FileID = meta.ID },
}
```
The non-beta `DocumentBlockParamSource` union has no file-ID variant — file references need `client.Beta.Messages.Create()`.
---

@@ -0,0 +1,28 @@
# Streaming — C#
## Streaming
```csharp
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
Model = Model.ClaudeOpus4_8,
MaxTokens = 64000,
Messages = [new() { Role = Role.User, Content = "Write a haiku" }]
};
await foreach (RawMessageStreamEvent streamEvent in client.Messages.CreateStreaming(parameters))
{
if (streamEvent.TryPickContentBlockDelta(out var delta) &&
delta.Delta.TryPickText(out var text))
{
Console.Write(text.Text);
}
}
```
**`RawMessageStreamEvent` TryPick methods** (naming drops the `Message`/`Raw` prefix): `TryPickStart`, `TryPickDelta`, `TryPickStop`, `TryPickContentBlockStart`, `TryPickContentBlockDelta`, `TryPickContentBlockStop`. There is no `TryPickMessageStop` — use `TryPickStop`.
---

@@ -0,0 +1,164 @@
# Tool Use — C#
For conceptual overview (tool definitions, tool choice, tips), see [shared/tool-use-concepts.md](../../shared/tool-use-concepts.md).
## Tool Use
### Defining a tool
`Tool` (NOT `ToolParam`) with an `InputSchema` record. `InputSchema.Type` is auto-set to `"object"` by the constructor — don't set it. `ToolUnion` has an implicit conversion from `Tool`, triggered by the collection expression `[...]`.
```csharp
using System.Text.Json;
using Anthropic.Models.Messages;
var parameters = new MessageCreateParams
{
Model = Model.ClaudeSonnet4_6,
MaxTokens = 16000,
Tools = [
new Tool {
Name = "get_weather",
Description = "Get the current weather in a given location",
InputSchema = new() {
Properties = new Dictionary<string, JsonElement> {
["location"] = JsonSerializer.SerializeToElement(
new { type = "string", description = "City name" }),
},
Required = ["location"],
},
},
],
Messages = [new() { Role = Role.User, Content = "Weather in Paris?" }],
};
```
Derived from `anthropic-sdk-csharp/src/Anthropic/Models/Messages/Tool.cs` and `ToolUnion.cs:799` (implicit conversion).
See [shared tool use concepts](../../shared/tool-use-concepts.md) for the loop pattern.
### Converting response content to the follow-up assistant message
When echoing Claude's response back in the assistant turn, **there is no `.ToParam()` helper** — manually reconstruct each `ContentBlock` variant as its `*Param` counterpart. Do NOT use `new ContentBlockParam(block.Json)`: it compiles and serializes, but `.Value` stays `null` so `TryPick*`/`Validate()` fail (degraded JSON pass-through, not the typed path).
```csharp
using Anthropic.Models.Messages;
Message response = await client.Messages.Create(parameters);
// No .ToParam() — reconstruct per variant. Implicit conversions from each
// *Param type to ContentBlockParam mean no explicit wrapper.
List<ContentBlockParam> assistantContent = [];
List<ContentBlockParam> toolResults = [];
foreach (ContentBlock block in response.Content)
{
if (block.TryPickText(out TextBlock? text))
{
assistantContent.Add(new TextBlockParam { Text = text.Text });
}
else if (block.TryPickThinking(out ThinkingBlock? thinking))
{
// Signature MUST be preserved — the API rejects tampering
assistantContent.Add(new ThinkingBlockParam
{
Thinking = thinking.Thinking,
Signature = thinking.Signature,
});
}
else if (block.TryPickRedactedThinking(out RedactedThinkingBlock? redacted))
{
assistantContent.Add(new RedactedThinkingBlockParam { Data = redacted.Data });
}
else if (block.TryPickToolUse(out ToolUseBlock? toolUse))
{
// ToolUseBlock has required Caller; ToolUseBlockParam.Caller is optional — don't copy it
assistantContent.Add(new ToolUseBlockParam
{
ID = toolUse.ID,
Name = toolUse.Name,
Input = toolUse.Input,
});
// Execute the tool; collect ONE result per tool_use block — the API
// rejects the follow-up if any tool_use ID lacks a matching tool_result.
string result = ExecuteYourTool(toolUse.Name, toolUse.Input);
toolResults.Add(new ToolResultBlockParam
{
ToolUseID = toolUse.ID,
Content = result,
});
}
}
// Follow-up: prior messages + assistant echo + user tool_result(s)
List<MessageParam> followUpMessages =
[
.. parameters.Messages,
new() { Role = Role.Assistant, Content = assistantContent },
new() { Role = Role.User, Content = toolResults },
];
```
`ToolResultBlockParam` has no tuple constructor — use the object initializer. `Content` is a string-or-list union; a plain `string` implicitly converts.
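Because the API rejects a follow-up in which any `tool_use` ID lacks a matching `tool_result`, a cheap pre-flight check before sending can save a round trip. The sketch below uses plain strings instead of SDK types; `MissingResults` is a hypothetical helper, and the IDs are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the tool_use IDs that have no matching tool_result ID.
static IReadOnlyList<string> MissingResults(
    IEnumerable<string> toolUseIds, IEnumerable<string> toolResultIds) =>
    toolUseIds.Except(toolResultIds).ToList();

// Made-up IDs; in real code these would come from ToolUseBlock.ID
// and ToolResultBlockParam.ToolUseID respectively.
string[] toolUseIds = new[] { "toolu_01", "toolu_02" };
string[] toolResultIds = new[] { "toolu_01" };

IReadOnlyList<string> missing = MissingResults(toolUseIds, toolResultIds);
Console.WriteLine(missing.Count == 0
    ? "all tool_use blocks answered"
    : $"missing tool_result for: {string.Join(", ", missing)}");
// prints: missing tool_result for: toolu_02
```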
---
## Structured Output
```csharp
OutputConfig = new OutputConfig {
Format = new JsonOutputFormat {
Schema = new Dictionary<string, JsonElement> {
["type"] = JsonSerializer.SerializeToElement("object"),
["properties"] = JsonSerializer.SerializeToElement(
new { name = new { type = "string" } }),
["required"] = JsonSerializer.SerializeToElement(new[] { "name" }),
},
},
},
```
`JsonOutputFormat.Type` is auto-set to `"json_schema"` by the constructor. `Schema` is `required`.
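To check that a schema dictionary serializes to the JSON Schema you intend before attaching it to a request, it can be built and inspected with `System.Text.Json` alone. The sketch below reuses the dictionary from the snippet above and makes no SDK calls; it only shows what the `Schema` value looks like as JSON.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

var schema = new Dictionary<string, JsonElement>
{
    ["type"] = JsonSerializer.SerializeToElement("object"),
    ["properties"] = JsonSerializer.SerializeToElement(
        new { name = new { type = "string" } }),
    ["required"] = JsonSerializer.SerializeToElement(new[] { "name" }),
};

// Serialize the dictionary to see the schema as JSON text.
string json = JsonSerializer.Serialize(schema);
Console.WriteLine(json);

// Parse it back and read individual fields to confirm the shape.
using JsonDocument doc = JsonDocument.Parse(json);
JsonElement root = doc.RootElement;
Console.WriteLine(root.GetProperty("type").GetString());         // object
Console.WriteLine(root.GetProperty("properties")
    .GetProperty("name").GetProperty("type").GetString());       // string
Console.WriteLine(root.GetProperty("required")[0].GetString());  // name
```

Anonymous-type property names are emitted as written, so lowercase names such as `type` and `name` reach the JSON unchanged under the default serializer options.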
---
## Anthropic-Defined Tools
Web search, bash, text editor, and code execution are Anthropic-defined tools with built-in schemas. Web search and code execution are server-executed; bash and text editor are client-executed (you handle the `tool_use` locally — see `shared/tool-use-concepts.md`). Type names are version-suffixed; constructors auto-set `name`/`type`. **Wrap each in `new ToolUnion(...)` explicitly.**
```csharp
Tools = [
new ToolUnion(new WebSearchTool20260209()),
new ToolUnion(new ToolBash20250124()),
new ToolUnion(new ToolTextEditor20250728()),
new ToolUnion(new CodeExecutionTool20260120()),
],
```
Also available: `new ToolUnion(new WebFetchTool20260209())`, `new ToolUnion(new MemoryTool20250818())`. `WebSearchTool20260209` optionals: `AllowedDomains`, `BlockedDomains`, `MaxUses`, `UserLocation`.
---
## Tool Runner (Beta)
The C# SDK provides a `BetaToolRunner` for automatic tool execution loops. Define tools with raw JSON schemas, and the runner handles the API call → tool execution → result feedback loop.
```csharp
using Anthropic.Models.Beta.Messages;
// Define tools and create params as shown in the Tool Use section above,
// but using the beta namespace types (BetaToolUnion, etc.)
var runner = client.Beta.Messages.ToolRunner(betaParams);
await foreach (BetaMessage message in runner)
{
foreach (var block in message.Content)
{
if (block.TryPickText(out var text))
{
Console.WriteLine(text.Text);
}
}
}
```
---

@@ -195,7 +195,8 @@ curl https://api.anthropic.com/v1/messages \
"model": "claude-opus-4-8",
"max_tokens": 16000,
"thinking": {
"type": "adaptive"
"type": "adaptive",
"display": "summarized"
},
"output_config": {
"effort": "high"
@@ -206,6 +207,44 @@ curl https://api.anthropic.com/v1/messages \
---
## Refusal Fallbacks (Claude Fable 5) — opt in by default
On `claude-fable-5`, safety classifiers may decline a request (HTTP 200 with `stop_reason: "refusal"`). Fallbacks are **opt-in**: without them the request simply stops. Include the `fallbacks` parameter and its beta header by default — on a policy decline the API re-runs the same request on the fallback model inside the same call. A decline before any output isn't billed (a mid-stream decline bills the streamed partial); the rescue bills at the fallback model's own rates.
```bash
response=$(curl -s https://api.anthropic.com/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "anthropic-beta: server-side-fallback-2026-06-01" \
-d '{
"model": "claude-fable-5",
"max_tokens": 16000,
"fallbacks": [{"model": "claude-opus-4-8"}],
"messages": [{"role": "user", "content": "Hello"}]
}')
# Which model produced the message
echo "$response" | jq -r '.model'
# Refusal on the final response means the whole chain refused
echo "$response" | jq -r '.stop_reason'
# Switch points: one fallback block per model that ran and declined this turn
echo "$response" | jq -r '.content[] | select(.type == "fallback") | "\(.from.model) declined; \(.to.model) continued"'
# Served-by signal — covers sticky turns, which carry no fallback block.
# Pair with stop_reason: the fallback model can itself refuse.
if [ "$(echo "$response" | jq -r '.stop_reason')" != "refusal" ] && \
echo "$response" | jq -e '[.usage.iterations[]? | select(.type == "fallback_message")] | length > 0' > /dev/null; then
echo "fallback model served this turn"
fi
```
The header must be exactly `server-side-fallback-2026-06-01`. The parameter is rejected on the Batches API and unavailable on Amazon Bedrock, Vertex AI, and Microsoft Foundry. Full semantics (sticky routing, billing, streaming, echoing fallback turns back): `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Required Headers
| Header | Value | Description |

@@ -1,440 +0,0 @@
# Claude API — Go
> **Note:** The Go SDK supports the Claude API and beta tool use with `BetaToolRunner`. Agent SDK is not yet available for Go.
## Installation
```bash
go get github.com/anthropics/anthropic-sdk-go
```
## Client Initialization
```go
import (
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/option"
)
// Default (uses ANTHROPIC_API_KEY env var)
client := anthropic.NewClient()
// Explicit API key
client := anthropic.NewClient(
option.WithAPIKey("your-api-key"),
)
```
---
## Model Constants
The Go SDK provides typed model constants: `anthropic.ModelClaudeFable5`, `anthropic.ModelClaudeOpus4_8`, `anthropic.ModelClaudeOpus4_7`, `anthropic.ModelClaudeSonnet4_6`, `anthropic.ModelClaudeHaiku4_5_20251001`. Use `ModelClaudeOpus4_8` unless the user specifies otherwise; if they ask for Fable or the most powerful model, use `anthropic.ModelClaudeFable5` (see `shared/models.md` for the full resolution table).
---
## Basic Message Request
```go
response, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_8,
MaxTokens: 16000,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("What is the capital of France?")),
},
})
if err != nil {
log.Fatal(err)
}
for _, block := range response.Content {
switch variant := block.AsAny().(type) {
case anthropic.TextBlock:
fmt.Println(variant.Text)
}
}
```
---
## Streaming
```go
stream := client.Messages.NewStreaming(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_6,
MaxTokens: 64000,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("Write a haiku")),
},
})
for stream.Next() {
event := stream.Current()
switch eventVariant := event.AsAny().(type) {
case anthropic.ContentBlockDeltaEvent:
switch deltaVariant := eventVariant.Delta.AsAny().(type) {
case anthropic.TextDelta:
fmt.Print(deltaVariant.Text)
}
}
}
if err := stream.Err(); err != nil {
log.Fatal(err)
}
```
**Accumulating the final message** (there is no `GetFinalMessage()` on the stream):
```go
stream := client.Messages.NewStreaming(ctx, params)
message := anthropic.Message{}
for stream.Next() {
message.Accumulate(stream.Current())
}
if err := stream.Err(); err != nil { log.Fatal(err) }
// message.Content now has the complete response
```
---
## Tool Use
### Tool Runner (Beta — Recommended)
**Beta:** The Go SDK provides `BetaToolRunner` for automatic tool use loops via the `toolrunner` package.
```go
import (
"context"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/toolrunner"
)
// Define tool input with jsonschema tags for automatic schema generation
type GetWeatherInput struct {
City string `json:"city" jsonschema:"required,description=The city name"`
}
// Create a tool with automatic schema generation from struct tags
weatherTool, err := toolrunner.NewBetaToolFromJSONSchema(
"get_weather",
"Get current weather for a city",
func(ctx context.Context, input GetWeatherInput) (anthropic.BetaToolResultBlockParamContentUnion, error) {
return anthropic.BetaToolResultBlockParamContentUnion{
OfText: &anthropic.BetaTextBlockParam{
Text: fmt.Sprintf("The weather in %s is sunny, 72°F", input.City),
},
}, nil
},
)
if err != nil {
log.Fatal(err)
}
// Create a tool runner that handles the conversation loop automatically
runner := client.Beta.Messages.NewToolRunner(
[]anthropic.BetaTool{weatherTool},
anthropic.BetaToolRunnerParams{
BetaMessageNewParams: anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_6,
MaxTokens: 16000,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("What's the weather in Paris?")),
},
},
MaxIterations: 5,
},
)
// Run until Claude produces a final response
message, err := runner.RunToCompletion(context.Background())
if err != nil {
log.Fatal(err)
}
// RunToCompletion returns *BetaMessage; content is []BetaContentBlockUnion.
// Narrow via AsAny() switch — note the Beta-namespace types (BetaTextBlock,
// not TextBlock):
for _, block := range message.Content {
switch block := block.AsAny().(type) {
case anthropic.BetaTextBlock:
fmt.Println(block.Text)
}
}
```
**Key features of the Go tool runner:**
- Automatic schema generation from Go structs via `jsonschema` tags
- `RunToCompletion()` for simple one-shot usage
- `All()` iterator for processing each message in the conversation
- `NextMessage()` for step-by-step iteration
- Streaming variant via `NewToolRunnerStreaming()` with `AllStreaming()`
### Manual Loop
For fine-grained control over the agentic loop, define tools with `ToolParam`, check `StopReason`, execute tools yourself, and feed `tool_result` blocks back. This is the pattern when you need to intercept, validate, or log tool calls.
Derived from `anthropic-sdk-go/examples/tools/main.go`.
```go
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
)
func main() {
client := anthropic.NewClient()
// 1. Define tools. ToolParam.InputSchema uses a map, no struct tags needed.
addTool := anthropic.ToolParam{
Name: "add",
Description: anthropic.String("Add two integers"),
InputSchema: anthropic.ToolInputSchemaParam{
Properties: map[string]any{
"a": map[string]any{"type": "integer"},
"b": map[string]any{"type": "integer"},
},
},
}
// ToolParam must be wrapped in ToolUnionParam for the Tools slice
tools := []anthropic.ToolUnionParam{{OfTool: &addTool}}
messages := []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("What is 2 + 3?")),
}
for {
resp, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeSonnet4_6,
MaxTokens: 16000,
Messages: messages,
Tools: tools,
})
if err != nil {
log.Fatal(err)
}
// 2. Append the assistant response to history BEFORE processing tool calls.
// resp.ToParam() converts Message → MessageParam in one call.
messages = append(messages, resp.ToParam())
// 3. Walk content blocks. ContentBlockUnion is a flattened struct;
// use block.AsAny().(type) to switch on the actual variant.
toolResults := []anthropic.ContentBlockParamUnion{}
for _, block := range resp.Content {
switch variant := block.AsAny().(type) {
case anthropic.TextBlock:
fmt.Println(variant.Text)
case anthropic.ToolUseBlock:
// 4. Parse the tool input. Use variant.JSON.Input.Raw() to get the
// raw JSON — block.Input is json.RawMessage, not the parsed value.
var in struct {
A int `json:"a"`
B int `json:"b"`
}
if err := json.Unmarshal([]byte(variant.JSON.Input.Raw()), &in); err != nil {
log.Fatal(err)
}
result := fmt.Sprintf("%d", in.A+in.B)
// 5. NewToolResultBlock(toolUseID, content, isError) builds the
// ContentBlockParamUnion for you. block.ID is the tool_use_id.
toolResults = append(toolResults,
anthropic.NewToolResultBlock(block.ID, result, false))
}
}
// 6. Exit when Claude stops asking for tools
if resp.StopReason != anthropic.StopReasonToolUse {
break
}
// 7. Tool results go in a user message (variadic: all results in one turn)
messages = append(messages, anthropic.NewUserMessage(toolResults...))
}
}
```
**Key API surface:**
| Symbol | Purpose |
|---|---|
| `resp.ToParam()` | Convert `Message` response → `MessageParam` for history |
| `block.AsAny().(type)` | Type-switch on `ContentBlockUnion` variants |
| `variant.JSON.Input.Raw()` | Raw JSON string of tool input (for `json.Unmarshal`) |
| `anthropic.NewToolResultBlock(id, content, isError)` | Build `tool_result` block |
| `anthropic.NewUserMessage(blocks...)` | Wrap tool results as a user turn |
| `anthropic.StopReasonToolUse` | `StopReason` constant to check loop termination |
| `anthropic.ToolUnionParam{OfTool: &t}` | Wrap `ToolParam` in the union for `Tools:` |
---
## Thinking
Enable Claude's internal reasoning by setting `Thinking` in `MessageNewParams`. The response will contain `ThinkingBlock` content before the final `TextBlock`.
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think. Combine with the `effort` parameter for cost-quality control.
Derived from `anthropic-sdk-go/message.go` (`ThinkingConfigParamUnion`, `ThinkingConfigAdaptiveParam`).
```go
// There is no ThinkingConfigParamOfAdaptive helper — construct the union
// struct-literal directly and take the address of the variant.
adaptive := anthropic.ThinkingConfigAdaptiveParam{}
params := anthropic.MessageNewParams{
Model: anthropic.ModelClaudeSonnet4_6,
MaxTokens: 16000,
Thinking: anthropic.ThinkingConfigParamUnion{OfAdaptive: &adaptive},
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("How many r's in strawberry?")),
},
}
resp, err := client.Messages.New(context.Background(), params)
if err != nil {
log.Fatal(err)
}
// ThinkingBlock(s) precede TextBlock in content
for _, block := range resp.Content {
switch b := block.AsAny().(type) {
case anthropic.ThinkingBlock:
fmt.Println("[thinking]", b.Thinking)
case anthropic.TextBlock:
fmt.Println(b.Text)
}
}
```
> **Deprecated:** `ThinkingConfigParamOfEnabled(budgetTokens)` (fixed-budget extended thinking) still works on Claude 4.6 but is deprecated. Use adaptive thinking above.
To disable: `anthropic.ThinkingConfigParamUnion{OfDisabled: &anthropic.ThinkingConfigDisabledParam{}}`.
---
## Prompt Caching
`System` is `[]TextBlockParam`; set `CacheControl` on the last block to cache tools + system together. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```go
System: []anthropic.TextBlockParam{{
Text: longSystemPrompt,
CacheControl: anthropic.NewCacheControlEphemeralParam(), // default 5m TTL
}},
```
For 1-hour TTL: `anthropic.CacheControlEphemeralParam{TTL: anthropic.CacheControlEphemeralTTLTTL1h}`. There's also a top-level `CacheControl` on `MessageNewParams` that auto-places on the last cacheable block.
Verify hits via `resp.Usage.CacheCreationInputTokens` / `resp.Usage.CacheReadInputTokens`.
---
## Server-Side Tools
Version-suffixed struct names with `Param` suffix. `Name`/`Type` are `constant.*` types — zero value marshals correctly, so `{}` works. Wrap in `ToolUnionParam` with the matching `Of*` field.
```go
Tools: []anthropic.ToolUnionParam{
{OfWebSearchTool20260209: &anthropic.WebSearchTool20260209Param{}},
{OfBashTool20250124: &anthropic.ToolBash20250124Param{}},
{OfTextEditor20250728: &anthropic.ToolTextEditor20250728Param{}},
{OfCodeExecutionTool20260120: &anthropic.CodeExecutionTool20260120Param{}},
},
```
Also available: `WebFetchTool20260209Param`, `MemoryTool20250818Param`, `ToolSearchToolBm25_20251119Param`, `ToolSearchToolRegex20251119Param`. For the advisor tool, use `BetaAdvisorTool20260301Param` in the beta namespace.
---
## Stop Details
When `StopReason` is `anthropic.StopReasonRefusal`, the response includes structured `StopDetails`:
```go
if resp.StopReason == anthropic.StopReasonRefusal {
fmt.Println("Category:", resp.StopDetails.Category) // "cyber" | "bio" | ""
fmt.Println("Explanation:", resp.StopDetails.Explanation)
}
```
---
## PDF / Document Input
`NewDocumentBlock` generic helper accepts any source type. `MediaType`/`Type` are auto-set.
```go
b64 := base64.StdEncoding.EncodeToString(pdfBytes)
msg := anthropic.NewUserMessage(
anthropic.NewDocumentBlock(anthropic.Base64PDFSourceParam{Data: b64}),
anthropic.NewTextBlock("Summarize this document"),
)
```
Other sources: `URLPDFSourceParam{URL: "https://..."}`, `PlainTextSourceParam{Data: "..."}`.
---
## Files API (Beta)
Under `client.Beta.Files`. Method is **`Upload`** (NOT `New`/`Create`), params struct is `BetaFileUploadParams`. The `File` field takes an `io.Reader`; use `anthropic.File()` to attach a filename + content-type for the multipart encoding.
```go
f, _ := os.Open("./upload_me.txt")
defer f.Close()
meta, err := client.Beta.Files.Upload(ctx, anthropic.BetaFileUploadParams{
File: anthropic.File(f, "upload_me.txt", "text/plain"),
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaFilesAPI2025_04_14},
})
// meta.ID is the file_id to reference in subsequent message requests
```
Other `Beta.Files` methods: `List`, `Delete`, `Download`, `GetMetadata`.
---
## Context Editing / Compaction (Beta)
Use `Beta.Messages.New` with `ContextManagement` on `BetaMessageNewParams`. There is no `NewBetaAssistantMessage` — use `.ToParam()` for the round-trip.
```go
params := anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_6, // also supported: ModelClaudeSonnet4_6
MaxTokens: 16000,
Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
ContextManagement: anthropic.BetaContextManagementConfigParam{
Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
},
},
Messages: []anthropic.BetaMessageParam{ /* ... */ },
}
resp, err := client.Beta.Messages.New(ctx, params)
if err != nil {
log.Fatal(err)
}
// Round-trip: append response to history via .ToParam()
params.Messages = append(params.Messages, resp.ToParam())
// Read compaction blocks from the response
for _, block := range resp.Content {
if c, ok := block.AsAny().(anthropic.BetaCompactionBlock); ok {
fmt.Println("compaction summary:", c.Content)
}
}
```
Other edit types: `BetaClearToolUses20250919EditParam`, `BetaClearThinking20251015EditParam`.


@@ -0,0 +1,185 @@
# Claude API — Go
> **Note:** The Go SDK supports the Claude API and beta tool use with `BetaToolRunner`. Agent SDK is not yet available for Go.
## Installation
```bash
go get github.com/anthropics/anthropic-sdk-go
```
## Client Initialization
```go
import (
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/option"
)
// Default (uses ANTHROPIC_API_KEY env var)
client := anthropic.NewClient()
// Explicit API key
client := anthropic.NewClient(
option.WithAPIKey("your-api-key"),
)
```
---
## Model Constants
The Go SDK provides typed model constants: `anthropic.ModelClaudeFable5`, `anthropic.ModelClaudeOpus4_8`, `anthropic.ModelClaudeOpus4_7`, `anthropic.ModelClaudeSonnet4_6`, `anthropic.ModelClaudeHaiku4_5_20251001`. Use `ModelClaudeOpus4_8` unless the user specifies otherwise; if they ask for Fable or the most powerful model, use `anthropic.ModelClaudeFable5` (see `shared/models.md` for the full resolution table).
---
## Basic Message Request
```go
response, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_8,
MaxTokens: 16000,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("What is the capital of France?")),
},
})
if err != nil {
log.Fatal(err)
}
for _, block := range response.Content {
switch variant := block.AsAny().(type) {
case anthropic.TextBlock:
fmt.Println(variant.Text)
}
}
```
---
## Thinking
Enable Claude's internal reasoning by setting `Thinking` in `MessageNewParams`. The response will contain `ThinkingBlock` content before the final `TextBlock`.
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think. Combine with the `effort` parameter for cost-quality control.
Derived from `anthropic-sdk-go/message.go` (`ThinkingConfigParamUnion`, `ThinkingConfigAdaptiveParam`).
```go
// There is no ThinkingConfigParamOfAdaptive helper — construct the union
// struct-literal directly and take the address of the variant.
adaptive := anthropic.ThinkingConfigAdaptiveParam{}
params := anthropic.MessageNewParams{
Model: anthropic.ModelClaudeSonnet4_6,
MaxTokens: 16000,
Thinking: anthropic.ThinkingConfigParamUnion{OfAdaptive: &adaptive},
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("How many r's in strawberry?")),
},
}
resp, err := client.Messages.New(context.Background(), params)
if err != nil {
log.Fatal(err)
}
// ThinkingBlock(s) precede TextBlock in content
for _, block := range resp.Content {
switch b := block.AsAny().(type) {
case anthropic.ThinkingBlock:
fmt.Println("[thinking]", b.Thinking)
case anthropic.TextBlock:
fmt.Println(b.Text)
}
}
```
> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking (above). `ThinkingConfigParamOfEnabled(budgetTokens)` is removed on Fable 5, Opus 4.8, and Opus 4.7 (the API returns a 400 if it is sent) and deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `anthropic.ThinkingConfigParamOfEnabled(N)` (budget must be < `MaxTokens`, min 1024).
To disable: `anthropic.ThinkingConfigParamUnion{OfDisabled: &anthropic.ThinkingConfigDisabledParam{}}`.
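The budget rules in the two notes above can be checked locally before a request is sent. The following is a minimal stdlib-only sketch, not part of the SDK: `checkThinkingBudget` and the `adaptiveOnly` table are hypothetical helpers, and the model ID strings other than `claude-fable-5` and `claude-opus-4-8` are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// adaptiveOnly lists models on which a fixed thinking budget is rejected,
// following the notes above. "claude-opus-4-7" is an assumed ID string.
var adaptiveOnly = map[string]bool{
	"claude-fable-5":  true,
	"claude-opus-4-8": true,
	"claude-opus-4-7": true,
}

// checkThinkingBudget validates a fixed-budget thinking request locally.
// It is a hypothetical helper and mirrors only the rules stated above.
func checkThinkingBudget(model string, budget, maxTokens int64) error {
	if adaptiveOnly[model] {
		return fmt.Errorf("%s: fixed budgets are removed, use adaptive thinking", model)
	}
	if budget < 1024 {
		return errors.New("budget must be at least 1024 tokens")
	}
	if budget >= maxTokens {
		return errors.New("budget must be less than MaxTokens")
	}
	return nil
}

func main() {
	// "some-older-model" is a placeholder, not a real model ID.
	fmt.Println(checkThinkingBudget("claude-opus-4-8", 2048, 16000))
	fmt.Println(checkThinkingBudget("some-older-model", 2048, 16000))
}
```

Failing fast on the client avoids spending a round-trip on a request that would be rejected with a 400.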
---
## Prompt Caching
`System` is `[]TextBlockParam`; set `CacheControl` on the last block to cache tools + system together. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```go
System: []anthropic.TextBlockParam{{
Text: longSystemPrompt,
CacheControl: anthropic.NewCacheControlEphemeralParam(), // default 5m TTL
}},
```
For 1-hour TTL: `anthropic.CacheControlEphemeralParam{TTL: anthropic.CacheControlEphemeralTTLTTL1h}`. There's also a top-level `CacheControl` on `MessageNewParams` that auto-places on the last cacheable block.
Verify hits via `resp.Usage.CacheCreationInputTokens` / `resp.Usage.CacheReadInputTokens`.
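To turn those counters into a single number, one option is to compute the share of prompt tokens that were read from cache. This is a sketch under stated assumptions: the local `usage` struct is a stand-in for `resp.Usage`, the `InputTokens` field and the idea that the three counters sum to the full prompt size are assumptions not stated in this file, and `cacheReadShare` is a hypothetical helper.

```go
package main

import "fmt"

// usage mirrors only the counters used here. In real code these values
// would come from resp.Usage; InputTokens is an assumed field.
type usage struct {
	InputTokens              int64
	CacheCreationInputTokens int64
	CacheReadInputTokens     int64
}

// cacheReadShare returns the fraction of prompt tokens served from cache,
// assuming the three counters together make up the whole prompt.
func cacheReadShare(u usage) float64 {
	total := u.InputTokens + u.CacheCreationInputTokens + u.CacheReadInputTokens
	if total == 0 {
		return 0
	}
	return float64(u.CacheReadInputTokens) / float64(total)
}

func main() {
	first := usage{InputTokens: 50, CacheCreationInputTokens: 950}
	second := usage{InputTokens: 500, CacheReadInputTokens: 500}
	fmt.Printf("first call: %.2f, second call: %.2f\n",
		cacheReadShare(first), cacheReadShare(second))
}
```

A share that stays at zero across repeated calls with the same prefix suggests that something in the cached prefix is changing between requests.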
---
## Stop Details
When `StopReason` is `anthropic.StopReasonRefusal`, the response includes structured `StopDetails`:
```go
if resp.StopReason == anthropic.StopReasonRefusal {
fmt.Println("Category:", resp.StopDetails.Category) // e.g. "cyber", "bio", "reasoning_extraction", "frontier_llm", or "" — see docs for the full set
fmt.Println("Explanation:", resp.StopDetails.Explanation)
}
```
**Refusal fallbacks (Claude Fable 5): include by default.** At the API level fallbacks are opt-in: without them a refused request simply stops. In `claude-fable-5` code, include `Fallbacks: []anthropic.BetaFallbackParam{{Model: "claude-opus-4-8"}}` plus the `anthropic.AnthropicBetaServerSideFallback2026_06_01` beta on `client.Beta.Messages.New` by default, so that a policy decline is re-served by the fallback model inside the same call. Server-side fallbacks are not available on Amazon Bedrock, Vertex AI, or Microsoft Foundry; on those platforms, register the client-side middleware instead: `option.WithMiddleware(betafallback.BetaRefusalFallbackMiddleware(...))` from `lib/betafallback`, with per-conversation state via `betafallback.WithBetaFallbackState(&betafallback.BetaFallbackState{})`. Full semantics (billing, sticky routing, streaming) and a runnable example: `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason, and the Go SDK repo's `examples/` (WebFetch via `shared/live-sources.md`).
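To illustrate how the fallback signals fit together, the sketch below decodes a raw response body with `encoding/json` and reads the same fields as the jq filters in the cURL example (`model`, `stop_reason`, `fallback` content blocks, `usage.iterations`). It deliberately avoids SDK types, since their exact names are not documented here. `servedByFallback`, the local structs, and the sample body are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// modelRef is the {"model": "..."} object inside a fallback block.
type modelRef struct {
	Model string `json:"model"`
}

// fallbackView decodes only the fields the jq filters read.
type fallbackView struct {
	StopReason string `json:"stop_reason"`
	Content    []struct {
		Type string   `json:"type"`
		From modelRef `json:"from"`
		To   modelRef `json:"to"`
	} `json:"content"`
	Usage struct {
		Iterations []struct {
			Type string `json:"type"`
		} `json:"iterations"`
	} `json:"usage"`
}

// servedByFallback reports whether a fallback model produced the final
// answer, and lists the switch points recorded as fallback blocks.
func servedByFallback(raw []byte) (bool, []string, error) {
	var v fallbackView
	if err := json.Unmarshal(raw, &v); err != nil {
		return false, nil, err
	}
	var switches []string
	for _, b := range v.Content {
		if b.Type == "fallback" {
			switches = append(switches, b.From.Model+" -> "+b.To.Model)
		}
	}
	ran := false
	for _, it := range v.Usage.Iterations {
		if it.Type == "fallback_message" {
			ran = true
		}
	}
	// The fallback model can itself refuse, so pair with stop_reason.
	return ran && v.StopReason != "refusal", switches, nil
}

func main() {
	// Hypothetical body, shaped after the fields the jq filters read.
	raw := []byte(`{
		"model": "claude-opus-4-8",
		"stop_reason": "end_turn",
		"content": [
			{"type": "fallback", "from": {"model": "claude-fable-5"}, "to": {"model": "claude-opus-4-8"}},
			{"type": "text", "text": "Hello!"}
		],
		"usage": {"iterations": [{"type": "fallback_message"}]}
	}`)
	served, switches, err := servedByFallback(raw)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(served, switches)
}
```

Checking `usage.iterations` as well as the content blocks matters because, per the cURL notes, sticky turns carry no `fallback` block.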
---
## PDF / Document Input
`NewDocumentBlock` generic helper accepts any source type. `MediaType`/`Type` are auto-set.
```go
b64 := base64.StdEncoding.EncodeToString(pdfBytes)
msg := anthropic.NewUserMessage(
anthropic.NewDocumentBlock(anthropic.Base64PDFSourceParam{Data: b64}),
anthropic.NewTextBlock("Summarize this document"),
)
```
Other sources: `URLPDFSourceParam{URL: "https://..."}`, `PlainTextSourceParam{Data: "..."}`.
---
## Context Editing / Compaction (Beta)
Use `Beta.Messages.New` with `ContextManagement` on `BetaMessageNewParams`. There is no `NewBetaAssistantMessage` — use `.ToParam()` for the round-trip.
```go
params := anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_8, // also supported: ModelClaudeSonnet4_6
MaxTokens: 16000,
Betas: []anthropic.AnthropicBeta{"compact-2026-01-12"},
ContextManagement: anthropic.BetaContextManagementConfigParam{
Edits: []anthropic.BetaContextManagementConfigEditUnionParam{
{OfCompact20260112: &anthropic.BetaCompact20260112EditParam{}},
},
},
Messages: []anthropic.BetaMessageParam{ /* ... */ },
}
resp, err := client.Beta.Messages.New(ctx, params)
if err != nil {
log.Fatal(err)
}
// Round-trip: append response to history via .ToParam()
params.Messages = append(params.Messages, resp.ToParam())
// Read compaction blocks from the response
for _, block := range resp.Content {
if c, ok := block.AsAny().(anthropic.BetaCompactionBlock); ok {
fmt.Println("compaction summary:", c.Content)
}
}
```
Other edit types: `BetaClearToolUses20250919EditParam`, `BetaClearThinking20251015EditParam` — these need `Betas: []anthropic.AnthropicBeta{"context-management-2025-06-27"}`, not `compact-2026-01-12`.
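Because the edit types map to two different beta flags, a small lookup can keep the pairing in one place. The sketch below is stdlib-only and hypothetical: `betaForEdit` is not an SDK function, and the edit-type strings are assumed wire names inferred from the struct names above.

```go
package main

import "fmt"

// betaForEdit maps a context-management edit type to the beta flag this
// section pairs it with. The edit-type strings are assumptions.
func betaForEdit(editType string) (string, error) {
	switch editType {
	case "compact_20260112":
		return "compact-2026-01-12", nil
	case "clear_tool_uses_20250919", "clear_thinking_20251015":
		return "context-management-2025-06-27", nil
	default:
		return "", fmt.Errorf("unknown edit type %q", editType)
	}
}

func main() {
	for _, e := range []string{"compact_20260112", "clear_thinking_20251015"} {
		beta, err := betaForEdit(e)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s needs beta %s\n", e, beta)
	}
}
```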


@@ -0,0 +1,21 @@
# Files API — Go
## Files API (Beta)
Under `client.Beta.Files`. Method is **`Upload`** (NOT `New`/`Create`), params struct is `BetaFileUploadParams`. The `File` field takes an `io.Reader`; use `anthropic.File()` to attach a filename + content-type for the multipart encoding.
```go
f, err := os.Open("./upload_me.txt")
if err != nil {
	log.Fatal(err)
}
defer f.Close()
meta, err := client.Beta.Files.Upload(ctx, anthropic.BetaFileUploadParams{
	File:  anthropic.File(f, "upload_me.txt", "text/plain"),
	Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaFilesAPI2025_04_14},
})
if err != nil {
	log.Fatal(err)
}
// meta.ID is the file_id to reference in subsequent message requests
```
Other `Beta.Files` methods: `List`, `Delete`, `Download`, `GetMetadata`.
---


@@ -0,0 +1,43 @@
# Streaming — Go
## Streaming
```go
stream := client.Messages.NewStreaming(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeOpus4_8,
MaxTokens: 64000,
Messages: []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("Write a haiku")),
},
})
for stream.Next() {
event := stream.Current()
switch eventVariant := event.AsAny().(type) {
case anthropic.ContentBlockDeltaEvent:
switch deltaVariant := eventVariant.Delta.AsAny().(type) {
case anthropic.TextDelta:
fmt.Print(deltaVariant.Text)
}
}
}
if err := stream.Err(); err != nil {
log.Fatal(err)
}
```
**Accumulating the final message** (there is no `GetFinalMessage()` on the stream):
```go
stream := client.Messages.NewStreaming(ctx, params)
message := anthropic.Message{}
for stream.Next() {
message.Accumulate(stream.Current())
}
if err := stream.Err(); err != nil {
	log.Fatal(err)
}
// message.Content now has the complete response
```
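For intuition about what accumulation does with the event stream, the sketch below concatenates `text_delta` payloads from a raw server-sent-events body using only the standard library. It is not how the SDK implements `Accumulate`, which also tracks other block types and usage. `collectText` and the sample body are hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// collectText joins the text of every text_delta found in an SSE body.
// bufio.Scanner's default line limit (64 KiB) is assumed to be enough.
func collectText(sse string) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(strings.NewReader(sse))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data: ") {
			continue // skip "event:" lines and blank separators
		}
		var ev struct {
			Type  string `json:"type"`
			Delta struct {
				Type string `json:"type"`
				Text string `json:"text"`
			} `json:"delta"`
		}
		if err := json.Unmarshal([]byte(strings.TrimPrefix(line, "data: ")), &ev); err != nil {
			return "", err
		}
		if ev.Type == "content_block_delta" && ev.Delta.Type == "text_delta" {
			out.WriteString(ev.Delta.Text)
		}
	}
	return out.String(), sc.Err()
}

func main() {
	// Hypothetical stream fragment with two text deltas and a ping.
	body := "event: content_block_delta\n" +
		`data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}` + "\n\n" +
		"event: ping\n" +
		`data: {"type":"ping"}` + "\n\n" +
		"event: content_block_delta\n" +
		`data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" world"}}` + "\n\n"
	text, err := collectText(body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(text)
}
```

In SDK code, prefer `message.Accumulate(stream.Current())` as shown above; the raw parsing here only illustrates the shape of the events.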
---


@@ -0,0 +1,220 @@
# Tool Use — Go
For conceptual overview (tool definitions, tool choice, tips), see [shared/tool-use-concepts.md](../../shared/tool-use-concepts.md).
## Tool Use
### Tool Runner (Beta — Recommended)
**Beta:** The Go SDK provides `BetaToolRunner` for automatic tool use loops via the `toolrunner` package.
```go
import (
"context"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/toolrunner"
)
// Define tool input with jsonschema tags for automatic schema generation
type GetWeatherInput struct {
City string `json:"city" jsonschema:"required,description=The city name"`
}
// Create a tool with automatic schema generation from struct tags
weatherTool, err := toolrunner.NewBetaToolFromJSONSchema(
"get_weather",
"Get current weather for a city",
func(ctx context.Context, input GetWeatherInput) (anthropic.BetaToolResultBlockParamContentUnion, error) {
return anthropic.BetaToolResultBlockParamContentUnion{
OfText: &anthropic.BetaTextBlockParam{
Text: fmt.Sprintf("The weather in %s is sunny, 72°F", input.City),
},
}, nil
},
)
if err != nil {
log.Fatal(err)
}
// Create a tool runner that handles the conversation loop automatically
runner := client.Beta.Messages.NewToolRunner(
[]anthropic.BetaTool{weatherTool},
anthropic.BetaToolRunnerParams{
BetaMessageNewParams: anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeOpus4_8,
MaxTokens: 16000,
Messages: []anthropic.BetaMessageParam{
anthropic.NewBetaUserMessage(anthropic.NewBetaTextBlock("What's the weather in Paris?")),
},
},
MaxIterations: 5,
},
)
// Run until Claude produces a final response
message, err := runner.RunToCompletion(context.Background())
if err != nil {
log.Fatal(err)
}
// RunToCompletion returns *BetaMessage; content is []BetaContentBlockUnion.
// Narrow via AsAny() switch — note the Beta-namespace types (BetaTextBlock,
// not TextBlock):
for _, block := range message.Content {
switch block := block.AsAny().(type) {
case anthropic.BetaTextBlock:
fmt.Println(block.Text)
}
}
```
**Key features of the Go tool runner:**
- Automatic schema generation from Go structs via `jsonschema` tags
- `RunToCompletion()` for simple one-shot usage
- `All()` iterator for processing each message in the conversation
- `NextMessage()` for step-by-step iteration
- Streaming variant via `NewToolRunnerStreaming()` with `AllStreaming()`
### Manual Loop
For fine-grained control over the agentic loop, define tools with `ToolParam`, check `StopReason`, execute tools yourself, and feed `tool_result` blocks back. This is the pattern when you need to intercept, validate, or log tool calls.
Derived from `anthropic-sdk-go/examples/tools/main.go`.
```go
package main
import (
"context"
"encoding/json"
"fmt"
"log"
"github.com/anthropics/anthropic-sdk-go"
)
func main() {
client := anthropic.NewClient()
// 1. Define tools. ToolParam.InputSchema uses a map, no struct tags needed.
addTool := anthropic.ToolParam{
Name: "add",
Description: anthropic.String("Add two integers"),
InputSchema: anthropic.ToolInputSchemaParam{
Properties: map[string]any{
"a": map[string]any{"type": "integer"},
"b": map[string]any{"type": "integer"},
},
},
}
// ToolParam must be wrapped in ToolUnionParam for the Tools slice
tools := []anthropic.ToolUnionParam{{OfTool: &addTool}}
messages := []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("What is 2 + 3?")),
}
for {
resp, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
Model: anthropic.ModelClaudeSonnet4_6,
MaxTokens: 16000,
Messages: messages,
Tools: tools,
})
if err != nil {
log.Fatal(err)
}
// 2. Append the assistant response to history BEFORE processing tool calls.
// resp.ToParam() converts Message → MessageParam in one call.
messages = append(messages, resp.ToParam())
// 3. Walk content blocks. ContentBlockUnion is a flattened struct;
// use block.AsAny().(type) to switch on the actual variant.
toolResults := []anthropic.ContentBlockParamUnion{}
for _, block := range resp.Content {
switch variant := block.AsAny().(type) {
case anthropic.TextBlock:
fmt.Println(variant.Text)
case anthropic.ToolUseBlock:
// 4. Parse the tool input. Use variant.JSON.Input.Raw() to get the
// raw JSON — block.Input is json.RawMessage, not the parsed value.
var in struct {
A int `json:"a"`
B int `json:"b"`
}
if err := json.Unmarshal([]byte(variant.JSON.Input.Raw()), &in); err != nil {
log.Fatal(err)
}
result := fmt.Sprintf("%d", in.A+in.B)
// 5. NewToolResultBlock(toolUseID, content, isError) builds the
// ContentBlockParamUnion for you. block.ID is the tool_use_id.
toolResults = append(toolResults,
anthropic.NewToolResultBlock(block.ID, result, false))
}
}
// 6. Exit when Claude stops asking for tools
if resp.StopReason != anthropic.StopReasonToolUse {
break
}
// 7. Tool results go in a user message (variadic: all results in one turn)
messages = append(messages, anthropic.NewUserMessage(toolResults...))
}
}
```
**Key API surface:**
| Symbol | Purpose |
|---|---|
| `resp.ToParam()` | Convert `Message` response → `MessageParam` for history |
| `block.AsAny().(type)` | Type-switch on `ContentBlockUnion` variants |
| `variant.JSON.Input.Raw()` | Raw JSON string of tool input (for `json.Unmarshal`) |
| `anthropic.NewToolResultBlock(id, content, isError)` | Build `tool_result` block |
| `anthropic.NewUserMessage(blocks...)` | Wrap tool results as a user turn |
| `anthropic.StopReasonToolUse` | `StopReason` constant to check loop termination |
| `anthropic.ToolUnionParam{OfTool: &t}` | Wrap `ToolParam` in the union for `Tools:` |
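The manual loop above hardcodes a single tool inside the type switch. With several tools, one way to structure step 4 is a name-to-handler registry whose result feeds the `content` and `isError` arguments of `NewToolResultBlock`. The sketch below is stdlib-only and hypothetical: `handler`, `registry`, and `dispatch` are illustrative names, not SDK symbols.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// handler runs one tool against its raw JSON input.
type handler func(input json.RawMessage) (string, error)

// registry maps tool names to handlers; "add" matches the example tool.
var registry = map[string]handler{
	"add": func(input json.RawMessage) (string, error) {
		var in struct {
			A int `json:"a"`
			B int `json:"b"`
		}
		if err := json.Unmarshal(input, &in); err != nil {
			return "", err
		}
		return fmt.Sprintf("%d", in.A+in.B), nil
	},
}

// dispatch returns the tool_result text and the isError flag. Failures
// are reported back to the model instead of aborting the loop.
func dispatch(name string, input json.RawMessage) (string, bool) {
	h, ok := registry[name]
	if !ok {
		return "unknown tool: " + name, true
	}
	out, err := h(input)
	if err != nil {
		return err.Error(), true
	}
	return out, false
}

func main() {
	out, isErr := dispatch("add", json.RawMessage(`{"a": 2, "b": 3}`))
	fmt.Println(out, isErr)
	out, isErr = dispatch("multiply", json.RawMessage(`{}`))
	fmt.Println(out, isErr)
}
```

Inside the loop, the call would look roughly like `out, isErr := dispatch(variant.Name, json.RawMessage(variant.JSON.Input.Raw()))`, assuming `ToolUseBlock` exposes a `Name` field. Returning errors as tool results lets the model see the failure and retry, where the original example calls `log.Fatal`.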
---
## Anthropic-Defined Tools
Version-suffixed struct names with `Param` suffix. `Name`/`Type` are `constant.*` types — zero value marshals correctly, so `{}` works. Wrap in `ToolUnionParam` with the matching `Of*` field. Web search and code execution are server-executed; bash and text editor are client-executed (you handle the `tool_use` locally — see `shared/tool-use-concepts.md`).
```go
Tools: []anthropic.ToolUnionParam{
{OfWebSearchTool20260209: &anthropic.WebSearchTool20260209Param{}},
{OfBashTool20250124: &anthropic.ToolBash20250124Param{}},
{OfTextEditor20250728: &anthropic.ToolTextEditor20250728Param{}},
{OfCodeExecutionTool20260120: &anthropic.CodeExecutionTool20260120Param{}},
},
```
Also available: `WebFetchTool20260209Param`, `ToolSearchToolBm25_20251119Param`, `ToolSearchToolRegex20251119Param`. For the advisor and memory tools, use `BetaAdvisorTool20260301Param` / `BetaMemoryTool20250818Param` in the beta namespace on `client.Beta.Messages.New`.
### Advisor tool (beta)
Server-side: there is no `tool_result` round-trip. The advisor model must be at least as capable as the executor (top-level) model; invalid pairs return a 400.
```go
response, err := client.Beta.Messages.New(ctx, anthropic.BetaMessageNewParams{
Model: anthropic.ModelClaudeSonnet4_6,
MaxTokens: 4096,
Tools: []anthropic.BetaToolUnionParam{
{OfAdvisorTool20260301: &anthropic.BetaAdvisorTool20260301Param{
Model: anthropic.ModelClaudeOpus4_8,
}},
},
Messages: []anthropic.BetaMessageParam{ /* ... */ },
Betas: []anthropic.AnthropicBeta{anthropic.AnthropicBetaAdvisorTool2026_03_01},
})
```
---


@@ -38,9 +38,11 @@ ctx := context.Background()
```go
environment, err := client.Beta.Environments.New(ctx, anthropic.BetaEnvironmentNewParams{
Name: "my-dev-env",
- Config: anthropic.BetaCloudConfigParams{
+ Config: anthropic.BetaEnvironmentNewParamsConfigUnion{
+ OfCloud: &anthropic.BetaCloudConfigParams{
Networking: anthropic.BetaCloudConfigParamsNetworkingUnion{
- OfUnrestricted: &anthropic.UnrestrictedNetworkParam{},
+ OfUnrestricted: &anthropic.BetaUnrestrictedNetworkParam{},
},
+ },
},
})
@@ -132,7 +134,7 @@ if err != nil {
```go
_, err = client.Beta.Sessions.Events.Send(ctx, session.ID, anthropic.BetaSessionEventSendParams{
- Events: []anthropic.SendEventsParamsUnion{{
+ Events: []anthropic.BetaManagedAgentsEventParamsUnion{{
OfUserMessage: &anthropic.BetaManagedAgentsUserMessageEventParams{
Type: anthropic.BetaManagedAgentsUserMessageEventParamsTypeUserMessage,
Content: []anthropic.BetaManagedAgentsUserMessageEventParamsContentUnion{{
@@ -161,7 +163,7 @@ stream := client.Beta.Sessions.Events.StreamEvents(ctx, session.ID, anthropic.Be
defer stream.Close()
if _, err := client.Beta.Sessions.Events.Send(ctx, session.ID, anthropic.BetaSessionEventSendParams{
- Events: []anthropic.SendEventsParamsUnion{{
+ Events: []anthropic.BetaManagedAgentsEventParamsUnion{{
OfUserMessage: &anthropic.BetaManagedAgentsUserMessageEventParams{
Type: anthropic.BetaManagedAgentsUserMessageEventParamsTypeUserMessage,
Content: []anthropic.BetaManagedAgentsUserMessageEventParamsContentUnion{{
@@ -383,8 +385,8 @@ agent, err := client.Beta.Agents.New(ctx, anthropic.BetaAgentNewParams{
ID: "claude-opus-4-8",
Type: anthropic.BetaManagedAgentsModelConfigParamsTypeModelConfig,
},
- MCPServers: []anthropic.BetaManagedAgentsUrlmcpServerParams{{
- Type: anthropic.BetaManagedAgentsUrlmcpServerParamsTypeURL,
+ MCPServers: []anthropic.BetaManagedAgentsURLMCPServerParams{{
+ Type: anthropic.BetaManagedAgentsURLMCPServerParamsTypeURL,
Name: "github",
URL: "https://api.githubcopilot.com/mcp/",
}},


@@ -0,0 +1,238 @@
# Claude API — Java
> **Note:** The Java SDK supports the Claude API and beta tool use with annotated classes. Agent SDK is not yet available for Java.
## Package Reference
Types are organized by package. If a class you need isn't shown in an example below, locate it via this table first — don't block on fetching SDK source over the network.
| `import` prefix | Contains |
|---|---|
| `com.anthropic.client` / `com.anthropic.client.okhttp` | `AnthropicClient`, `AnthropicOkHttpClient` |
| `com.anthropic.models.messages` | non-beta request/response types — `MessageCreateParams`, `Model`, `Message`, `TextBlockParam`, `ContentBlockParam`, `ToolUseBlockParam`, `ToolResultBlockParam`, `CacheControlEphemeral`, `Tool*` (e.g. `ToolBash20250124`, `ToolTextEditor20250728`), `StopReason`, `StructuredMessage*` |
| `com.anthropic.models.messages.batches` | Batch API — `BatchResultsParams`, `MessageBatchIndividualResponse` |
| `com.anthropic.models.beta` | `AnthropicBeta` (beta-flag constants) |
| `com.anthropic.models.beta.messages` | beta-endpoint types — `MessageCreateParams`, `BetaMessage`, `BetaStopReason`, `BetaContextManagementConfig`, `BetaMcpToolset`, `BetaRequestMcpServerUrlDefinition`, `BetaTool*` |
| `com.anthropic.core` | `JsonValue`, `JsonField`, `JsonSchemaLocalValidation`, `com.anthropic.core.http.StreamResponse` |
| `com.anthropic.errors` | typed exceptions — `AnthropicServiceException`, `RateLimitException`, `NotFoundException`, etc. (see `shared/error-codes.md`) |
`client.messages()` uses `com.anthropic.models.messages.*`; `client.beta().messages()` uses `com.anthropic.models.beta.messages.*`. Both packages define a `MessageCreateParams` — import the one matching the client path you call.
### Key types per feature
Write from this table instead of `javap`/jar inspection. Endpoint column tells you whether to use `client.messages()` or `client.beta().messages()`.
| Feature | Endpoint | Key Java types / builder calls |
|---|---|---|
| User profiles | beta | `client.beta().userProfiles().create(...)` / `.retrieve(id)` / `.list()`. Pass the returned profile id on the beta `MessageCreateParams`. Requires a beta header — check the SDK's beta-headers reference for the current flag. |
| Agent Skills | beta | `BetaContainerParams`, `BetaSkillParams`, `BetaCodeExecutionTool20250825`. `.addBeta("code-execution-2025-08-25").addBeta("skills-2025-10-02")`. Download the output via `client.beta().files().download(fileId)`. |
| Cache diagnostics | beta | `BetaDiagnosticsParam`, `BetaCacheControlEphemeral` |
| Context editing | beta | `.contextManagement(BetaContextManagementConfig.builder()…)`. The edit strategy is a `BetaClearToolUses20250919Edit` (or `BetaClearThinking20251015Edit`); its trigger is a `BetaInputTokensTrigger` built separately and passed to the edit's builder — there is no direct `.inputTokensTrigger(N)` shortcut on the edit builder. `javap` the edit and trigger classes for the exact setter names. |
| Memory tool | non-beta | `.addTool(MemoryTool20250818.builder().build())` from `com.anthropic.models.messages` |
| Programmatic tool calling | non-beta | `CodeExecutionTool20260120`, `Tool`, `ContentBlockParam` |
| Strict tool use | non-beta | `Tool`, `Tool.InputSchema` |
| Task budgets | beta | `.outputConfig(BetaOutputConfig.builder().taskBudget(BetaTokenTaskBudget.builder()...))` |
| Tool search | non-beta | `.addTool(ToolSearchToolRegex20251119.builder()...)` from `com.anthropic.models.messages` |
| Web search | non-beta | `WebSearchTool20260209` from `com.anthropic.models.messages` — the latest variant with dynamic filtering (Opus 4.8/4.7/4.6 + Sonnet 4.6). For older models or Vertex, use `WebSearchTool20250305` |
### Discovering type and member names
If a class or builder method you need isn't in the tables above, `jar tf <anthropic-java-core jar> | grep -i <term>` or `javap -classpath <jar> com.anthropic.models.…` is fast enough to locate names. **Do not compile and run a separate reflection program** to enumerate members — the first build is slow enough to be backgrounded in many environments, trapping you in a polling loop. Write the script with the names you found and let the compiler error (`cannot find symbol`) point at any wrong member.
## Installation
Maven:
```xml
<dependency>
<groupId>com.anthropic</groupId>
<artifactId>anthropic-java</artifactId>
<version>2.34.0</version>
</dependency>
```
Gradle:
```groovy
implementation("com.anthropic:anthropic-java:2.34.0")
```
## Client Initialization
```java
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
// Default (reads ANTHROPIC_API_KEY from environment)
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Explicit API key
AnthropicClient client = AnthropicOkHttpClient.builder()
.apiKey("your-api-key")
.build();
```
---
## Basic Message Request
```java
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.Model;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_8)
.maxTokens(16000L)
.addUserMessage("What is the capital of France?")
.build();
Message response = client.messages().create(params);
response.content().stream()
.flatMap(block -> block.text().stream())
.forEach(textBlock -> System.out.println(textBlock.text()));
```
---
## Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think. The builder has a direct `.thinking(ThinkingConfigAdaptive)` overload — no manual union wrapping.
> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking (below). `ThinkingConfigEnabled.builder().budgetTokens(N)` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `.thinking(ThinkingConfigEnabled.builder().budgetTokens(N).build())` (budget must be < `maxTokens`, min 1024).
```java
import com.anthropic.models.messages.ContentBlock;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.ThinkingConfigAdaptive;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_SONNET_4_6)
.maxTokens(16000L)
.thinking(ThinkingConfigAdaptive.builder().build())
.addUserMessage("Solve this step by step: 27 * 453")
.build();
for (ContentBlock block : client.messages().create(params).content()) {
block.thinking().ifPresent(t -> System.out.println("[thinking] " + t.thinking()));
block.text().ifPresent(t -> System.out.println(t.text()));
}
```
`ContentBlock` narrowing: `.thinking()` / `.text()` return `Optional<T>`; use `.ifPresent(...)`, or `.flatMap(b -> b.text().stream())` inside a stream pipeline. Alternative: `isThinking()` / `asThinking()` check-then-unwrap pairs (`asThinking()` throws if the block is a different variant).
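The two narrowing styles can be compared side by side in a minimal, SDK-free sketch. The `Block` class below is a hypothetical stand-in that only mirrors the accessor shapes described above; it is not the SDK's `ContentBlock`, and the method names `viaOptional` and `thinkingOnly` are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class NarrowingSketch {
    // Hypothetical stand-in for a union-typed block: exactly one payload is set.
    // This is NOT the SDK's ContentBlock; it only mirrors the accessor shapes.
    static final class Block {
        private final String thinking; // null unless this is a thinking block
        private final String text;     // null unless this is a text block

        private Block(String thinking, String text) {
            this.thinking = thinking;
            this.text = text;
        }

        static Block ofThinking(String s) { return new Block(s, null); }
        static Block ofText(String s) { return new Block(null, s); }

        Optional<String> thinking() { return Optional.ofNullable(thinking); }
        Optional<String> text() { return Optional.ofNullable(text); }

        boolean isThinking() { return thinking != null; }

        String asThinking() {
            if (thinking == null) {
                throw new IllegalStateException("block is not a thinking block");
            }
            return thinking;
        }
    }

    // Style 1: Optional accessors. A non-matching variant is simply skipped.
    static List<String> viaOptional(List<Block> blocks) {
        List<String> out = new ArrayList<>();
        for (Block b : blocks) {
            b.thinking().ifPresent(t -> out.add("[thinking] " + t));
            b.text().ifPresent(out::add);
        }
        return out;
    }

    // Style 2: check the variant first, then unwrap. Calling asThinking()
    // without the check would throw for text blocks.
    static List<String> thinkingOnly(List<Block> blocks) {
        List<String> out = new ArrayList<>();
        for (Block b : blocks) {
            if (b.isThinking()) {
                out.add(b.asThinking());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            Block.ofThinking("27 * 453 = 27 * 450 + 27 * 3"),
            Block.ofText("12231"));
        System.out.println(viaOptional(blocks));
        System.out.println(thinkingOnly(blocks));
    }
}
```

The `Optional` style tends to read better when every variant gets its own handler; the check-then-unwrap style is convenient when only one variant matters.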
---
## Effort Parameter
Effort is nested inside `OutputConfig` — there is NO `.effort()` directly on `MessageCreateParams.Builder`.
```java
import com.anthropic.models.messages.OutputConfig;
.outputConfig(OutputConfig.builder()
.effort(OutputConfig.Effort.HIGH) // or LOW, MEDIUM, MAX
.build())
```
Combine with `.thinking(ThinkingConfigAdaptive.builder().build())` for cost-quality control.
---
## Prompt Caching
System message as a list of `TextBlockParam` with `CacheControlEphemeral`. Use `.systemOfTextBlockParams(...)` — the plain `.system(String)` overload can't carry cache control. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```java
import com.anthropic.models.messages.TextBlockParam;
import com.anthropic.models.messages.CacheControlEphemeral;
import java.util.List;
.systemOfTextBlockParams(List.of(
TextBlockParam.builder()
.text(longSystemPrompt)
.cacheControl(CacheControlEphemeral.builder()
.ttl(CacheControlEphemeral.Ttl.TTL_1H) // optional; also TTL_5M
.build())
.build()))
```
There's also a top-level `.cacheControl(CacheControlEphemeral)` on `MessageCreateParams.Builder` and on `Tool.builder()`.
Verify hits via `response.usage().cacheCreationInputTokens()` / `response.usage().cacheReadInputTokens()`.
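As a rough illustration of how those two counters can be read, the sketch below classifies a single response. It assumes the token counts have already been unwrapped to plain `long` values (the SDK accessors may return optionals), and the class name, method name, and labels are hypothetical.

```java
final class CacheUsageSketch {
    // Classifies one response from its two cache counters.
    // "hit" takes precedence: a response can both read an existing prefix
    // and write a new cache entry in the same request.
    static String describe(long cacheCreationInputTokens, long cacheReadInputTokens) {
        if (cacheReadInputTokens > 0) {
            return "hit";   // part of the prompt was served from the cache
        }
        if (cacheCreationInputTokens > 0) {
            return "write"; // nothing was reused, but a cache entry was created
        }
        return "none";      // caching did not apply to this request
    }

    public static void main(String[] args) {
        System.out.println(describe(2048, 0)); // first request with a cacheable prefix
        System.out.println(describe(0, 2048)); // repeat request within the TTL
        System.out.println(describe(0, 0));    // no cache_control, or prefix too short
    }
}
```

A steady stream of "write" results with no "hit" usually means something in the cached prefix changes between requests.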
---
## Token Counting
```java
import com.anthropic.models.messages.MessageCountTokensParams;
import com.anthropic.models.messages.Model;
long tokens = client.messages().countTokens(
MessageCountTokensParams.builder()
.model(Model.CLAUDE_SONNET_4_6)
.addUserMessage("Hello")
.build()
).inputTokens();
```
---
## PDF / Document Input
`DocumentBlockParam` builder has source shortcuts. Wrap in `ContentBlockParam.ofDocument()` and pass via `.addUserMessageOfBlockParams()`.
```java
import com.anthropic.models.messages.DocumentBlockParam;
import com.anthropic.models.messages.ContentBlockParam;
import com.anthropic.models.messages.TextBlockParam;
import com.anthropic.models.messages.Base64PdfSource;
DocumentBlockParam doc = DocumentBlockParam.builder()
.source(Base64PdfSource.builder().data(base64String).build())
// or .source(UrlPdfSource.builder().url("https://...").build())
.title("My Document") // optional
.build();
```
For **Files API** document references, use the beta path and beta types — see `files-api.md`: `BetaRequestDocumentBlock.builder().source(BetaFileDocumentSource.builder().fileId(id).build())`.
```java
.addUserMessageOfBlockParams(List.of(
ContentBlockParam.ofDocument(doc),
ContentBlockParam.ofText(TextBlockParam.builder().text("Summarize this").build())))
```
---
## Stop Details
When `stopReason()` is `"refusal"`, the response includes structured `stopDetails()`:
```java
response.stopDetails().ifPresent(details -> {
System.out.println("Category: " + details.category());
System.out.println("Explanation: " + details.explanation());
});
```
**Refusal fallbacks (Claude Fable 5) — opt in by default.** Fallbacks are opt-in: without them a refused request simply stops. New `claude-fable-5` code should include the server-side `fallbacks` parameter (beta header `server-side-fallback-2026-06-01`, fallback model `claude-opus-4-8`, on the beta messages call) by default. The exact Java builder methods (and the client-side middleware for providers without server-side support) are not documented here — WebFetch the Java SDK repo's `examples/` from `shared/live-sources.md`; full semantics in `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Error Type
`AnthropicServiceException` exposes `.errorType()` returning `Optional<ErrorType>` for programmatic error classification:
```java
import com.anthropic.errors.AnthropicServiceException;
try {
client.messages().create(params);
} catch (AnthropicServiceException e) {
e.errorType().ifPresent(type ->
System.out.println("Error type: " + type) // RATE_LIMIT_ERROR, OVERLOADED_ERROR, etc.
);
}
```
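One possible use of the error type is deciding whether a failed call is worth retrying. The sketch below is SDK-free and works on the wire-format strings; the retry policy (treating `rate_limit_error` and `overloaded_error` as transient, everything else as permanent) is an assumption for illustration, and `ErrorTypeSketch` / `isRetryable` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

final class ErrorTypeSketch {
    // Assumed policy: these wire-format error types are treated as transient.
    private static final Set<String> RETRYABLE =
        new HashSet<>(Arrays.asList("rate_limit_error", "overloaded_error"));

    // Takes an Optional to mirror the shape of an optional error-type accessor.
    // A missing error type is treated as non-retryable (also an assumption).
    static boolean isRetryable(Optional<String> errorType) {
        return errorType.map(RETRYABLE::contains).orElse(false);
    }

    public static void main(String[] args) {
        System.out.println(isRetryable(Optional.of("rate_limit_error")));
        System.out.println(isRetryable(Optional.of("invalid_request_error")));
        System.out.println(isRetryable(Optional.empty()));
    }
}
```

Check whether the client already retries transient failures before layering a loop like this on top, so the two do not multiply.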
---


@@ -0,0 +1,25 @@
# Files API — Java
## Files API (Beta)
Under `client.beta().files()`. File references in messages need the beta message types (non-beta `DocumentBlockParam.Source` has no file-ID variant).
```java
import com.anthropic.models.beta.files.FileUploadParams;
import com.anthropic.models.beta.files.FileMetadata;
import com.anthropic.models.beta.messages.BetaRequestDocumentBlock;
import com.anthropic.models.beta.messages.BetaFileDocumentSource;
import java.nio.file.Paths;
FileMetadata meta = client.beta().files().upload(
FileUploadParams.builder()
.file(Paths.get("/path/to/doc.pdf")) // or .file(InputStream) or .file(byte[])
.build());
// Reference in a beta message:
BetaRequestDocumentBlock doc = BetaRequestDocumentBlock.builder()
.source(BetaFileDocumentSource.builder().fileId(meta.id()).build())
.build();
```
Other methods: `.list()`, `.delete(String fileId)`, `.download(String fileId)`, `.retrieveMetadata(String fileId)`.


@@ -0,0 +1,24 @@
# Streaming — Java
## Streaming
```java
import com.anthropic.core.http.StreamResponse;
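import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.RawMessageStreamEvent;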
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_8)
.maxTokens(64000L)
.addUserMessage("Write a haiku")
.build();
try (StreamResponse<RawMessageStreamEvent> streamResponse = client.messages().createStreaming(params)) {
streamResponse.stream()
.flatMap(event -> event.contentBlockDelta().stream())
.flatMap(deltaEvent -> deltaEvent.delta().text().stream())
.forEach(textDelta -> System.out.print(textDelta.text()));
}
```
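The chained `flatMap` calls above rely on `Optional.stream()` (Java 9+), which turns an empty optional into an empty stream and a present one into a single-element stream. A minimal SDK-free sketch of that mechanism, collecting the deltas into one string instead of printing them, might look like this. `Event` is a hypothetical stand-in, not the SDK's `RawMessageStreamEvent`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class DeltaAccumulatorSketch {
    // Hypothetical stand-in for a stream event: only some events carry text.
    static final class Event {
        private final String textDelta; // null for non-text events

        Event(String textDelta) { this.textDelta = textDelta; }

        Optional<String> textDelta() { return Optional.ofNullable(textDelta); }
    }

    // Optional.stream() yields zero or one element, so flatMap drops events
    // without text and unwraps the rest in a single step.
    static String collectText(List<Event> events) {
        return events.stream()
            .flatMap(e -> e.textDelta().stream())
            .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event("An old "),
            new Event(null),      // e.g. a start or stop event with no text
            new Event("silent pond"));
        System.out.println(collectText(events));
    }
}
```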
---


@@ -1,113 +1,6 @@
# Claude API — Java
# Tool Use — Java
> **Note:** The Java SDK supports the Claude API and beta tool use with annotated classes. Agent SDK is not yet available for Java.
## Installation
Maven:
```xml
<dependency>
<groupId>com.anthropic</groupId>
<artifactId>anthropic-java</artifactId>
<version>2.34.0</version>
</dependency>
```
Gradle:
```groovy
implementation("com.anthropic:anthropic-java:2.34.0")
```
## Client Initialization
```java
import com.anthropic.client.AnthropicClient;
import com.anthropic.client.okhttp.AnthropicOkHttpClient;
// Default (reads ANTHROPIC_API_KEY from environment)
AnthropicClient client = AnthropicOkHttpClient.fromEnv();
// Explicit API key
AnthropicClient client = AnthropicOkHttpClient.builder()
.apiKey("your-api-key")
.build();
```
---
## Basic Message Request
```java
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Message;
import com.anthropic.models.messages.Model;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_6)
.maxTokens(16000L)
.addUserMessage("What is the capital of France?")
.build();
Message response = client.messages().create(params);
response.content().stream()
.flatMap(block -> block.text().stream())
.forEach(textBlock -> System.out.println(textBlock.text()));
```
---
## Streaming
```java
import com.anthropic.core.http.StreamResponse;
import com.anthropic.models.messages.RawMessageStreamEvent;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_6)
.maxTokens(64000L)
.addUserMessage("Write a haiku")
.build();
try (StreamResponse<RawMessageStreamEvent> streamResponse = client.messages().createStreaming(params)) {
streamResponse.stream()
.flatMap(event -> event.contentBlockDelta().stream())
.flatMap(deltaEvent -> deltaEvent.delta().text().stream())
.forEach(textDelta -> System.out.print(textDelta.text()));
}
```
---
## Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think. The builder has a direct `.thinking(ThinkingConfigAdaptive)` overload — no manual union wrapping.
```java
import com.anthropic.models.messages.ContentBlock;
import com.anthropic.models.messages.MessageCreateParams;
import com.anthropic.models.messages.Model;
import com.anthropic.models.messages.ThinkingConfigAdaptive;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_SONNET_4_6)
.maxTokens(16000L)
.thinking(ThinkingConfigAdaptive.builder().build())
.addUserMessage("Solve this step by step: 27 * 453")
.build();
for (ContentBlock block : client.messages().create(params).content()) {
block.thinking().ifPresent(t -> System.out.println("[thinking] " + t.thinking()));
block.text().ifPresent(t -> System.out.println(t.text()));
}
```
> **Deprecated:** `ThinkingConfigEnabled.builder().budgetTokens(N)` (and the `.enabledThinking(N)` shortcut) still works on Claude 4.6 but is deprecated. Use adaptive thinking above.
`ContentBlock` narrowing: `.thinking()` / `.text()` return `Optional<T>` — use `.ifPresent(...)` or `.stream().flatMap(...)`. Alternative: `isThinking()` / `asThinking()` boolean+unwrap pairs (throws on wrong variant).
---
For conceptual overview (tool definitions, tool choice, tips), see [shared/tool-use-concepts.md](../../shared/tool-use-concepts.md).
## Tool Use (Beta)
@@ -181,7 +74,7 @@ for (BetaMessage message : toolRunner) {
}
```
See the [shared memory tool concepts](../shared/tool-use-concepts.md) for more details on the memory tool.
See the [shared memory tool concepts](../../shared/tool-use-concepts.md) for more details on the memory tool.
### Non-Beta Tool Declaration (manual JSON schema)
@@ -210,7 +103,7 @@ MessageCreateParams params = MessageCreateParams.builder()
.build();
```
For manual tool loops, handle `tool_use` blocks in the response, send `tool_result` back, loop until `stop_reason` is `"end_turn"`. See [shared tool use concepts](../shared/tool-use-concepts.md).
For manual tool loops, handle `tool_use` blocks in the response, send `tool_result` back, loop until `stop_reason` is `"end_turn"`. See [shared tool use concepts](../../shared/tool-use-concepts.md).
### Building `MessageParam` with Content Blocks (Tool Result Round-Trip)
@@ -236,60 +129,6 @@ MessageParam toolResultMsg = MessageParam.builder()
---
## Effort Parameter
Effort is nested inside `OutputConfig` — there is NO `.effort()` directly on `MessageCreateParams.Builder`.
```java
import com.anthropic.models.messages.OutputConfig;
.outputConfig(OutputConfig.builder()
.effort(OutputConfig.Effort.HIGH) // or LOW, MEDIUM, MAX
.build())
```
Combine with `Thinking = ThinkingConfigAdaptive` for cost-quality control.
---
## Prompt Caching
System message as a list of `TextBlockParam` with `CacheControlEphemeral`. Use `.systemOfTextBlockParams(...)` — the plain `.system(String)` overload can't carry cache control. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```java
import com.anthropic.models.messages.TextBlockParam;
import com.anthropic.models.messages.CacheControlEphemeral;
.systemOfTextBlockParams(List.of(
TextBlockParam.builder()
.text(longSystemPrompt)
.cacheControl(CacheControlEphemeral.builder()
.ttl(CacheControlEphemeral.Ttl.TTL_1H) // optional; also TTL_5M
.build())
.build()))
```
There's also a top-level `.cacheControl(CacheControlEphemeral)` on `MessageCreateParams.Builder` and on `Tool.builder()`.
Verify hits via `response.usage().cacheCreationInputTokens()` / `response.usage().cacheReadInputTokens()`.
---
## Token Counting
```java
import com.anthropic.models.messages.MessageCountTokensParams;
long tokens = client.messages().countTokens(
MessageCountTokensParams.builder()
.model(Model.CLAUDE_SONNET_4_6)
.addUserMessage("Hello")
.build()
).inputTokens();
```
---
## Structured Output
The class-based overload auto-derives the JSON schema from your POJO and gives you a typed `.text()` return — no manual schema, no manual parsing.
@@ -319,30 +158,9 @@ Supports Jackson annotations: `@JsonPropertyDescription`, `@JsonIgnore`, `@Array
---
## PDF / Document Input
## Anthropic-Defined Tools
`DocumentBlockParam` builder has source shortcuts. Wrap in `ContentBlockParam.ofDocument()` and pass via `.addUserMessageOfBlockParams()`.
```java
import com.anthropic.models.messages.DocumentBlockParam;
import com.anthropic.models.messages.ContentBlockParam;
import com.anthropic.models.messages.TextBlockParam;
DocumentBlockParam doc = DocumentBlockParam.builder()
.base64Source(base64String) // or .urlSource("https://...") or .textSource("...")
.title("My Document") // optional
.build();
.addUserMessageOfBlockParams(List.of(
ContentBlockParam.ofDocument(doc),
ContentBlockParam.ofText(TextBlockParam.builder().text("Summarize this").build())))
```
---
## Server-Side Tools
Version-suffixed types; `name`/`type` auto-set by builder. Direct `.addTool()` overloads exist for every type — no manual `ToolUnion` wrapping.
Version-suffixed types; `name`/`type` auto-set by builder. Direct `.addTool()` overloads exist for most tool types; where one is missing (newer or less-common tools — see the advisor note below), wrap via the union type's static factory: `.addTool(BetaToolUnion.of<ToolName>(builder…build()))`. Web search and code execution are server-executed; bash and text editor are client-executed (you handle the `tool_use` locally — see `shared/tool-use-concepts.md`).
```java
import com.anthropic.models.messages.WebSearchTool20260209;
@@ -359,7 +177,7 @@ import com.anthropic.models.messages.CodeExecutionTool20260120;
.addTool(CodeExecutionTool20260120.builder().build())
```
Also available: `WebFetchTool20260209`, `MemoryTool20250818`, `ToolSearchToolBm25_20251119`. For the advisor tool, use `BetaAdvisorTool20260301` in the beta namespace.
Also available: `WebFetchTool20260209`, `MemoryTool20250818`, `ToolSearchToolBm25_20251119`. For the advisor tool, use `BetaAdvisorTool20260301` in the beta namespace with `.addBeta("advisor-tool-2026-03-01")` (server-side; advisor model ≥ executor model). There is no direct `.addTool(BetaAdvisorTool20260301)` overload on the beta builder — wrap it via the `BetaToolUnion` static factory for the advisor type; if `javac` rejects the specific factory method name, `javap com.anthropic.models.beta.messages.BetaToolUnion | grep -i advisor` shows the exact one.
### Beta namespace (MCP, compaction)
@@ -372,7 +190,7 @@ import com.anthropic.models.beta.messages.BetaCodeExecutionTool20260120;
import com.anthropic.models.beta.messages.BetaRequestMcpServerUrlDefinition;
MessageCreateParams params = MessageCreateParams.builder()
.model(Model.CLAUDE_OPUS_4_6)
.model(Model.CLAUDE_OPUS_4_8)
.maxTokens(16000L)
.addBeta("mcp-client-2025-11-20")
.addTool(BetaToolBash20250124.builder().build())
@@ -408,54 +226,3 @@ for (ContentBlock block : response.content()) {
---
## Stop Details
When `stopReason()` is `"refusal"`, the response includes structured `stopDetails()`:
```java
response.stopDetails().ifPresent(details -> {
System.out.println("Category: " + details.category());
System.out.println("Explanation: " + details.explanation());
});
```
---
## Error Type
`AnthropicServiceException` exposes `.errorType()` returning `Optional<ErrorType>` for programmatic error classification:
```java
try {
client.messages().create(params);
} catch (AnthropicServiceException e) {
e.errorType().ifPresent(type ->
System.out.println("Error type: " + type) // RATE_LIMIT_ERROR, OVERLOADED_ERROR, etc.
);
}
```
---
## Files API (Beta)
Under `client.beta().files()`. File references in messages need the beta message types (non-beta `DocumentBlockParam.Source` has no file-ID variant).
```java
import com.anthropic.models.beta.files.FileUploadParams;
import com.anthropic.models.beta.files.FileMetadata;
import com.anthropic.models.beta.messages.BetaRequestDocumentBlock;
import java.nio.file.Paths;
FileMetadata meta = client.beta().files().upload(
FileUploadParams.builder()
.file(Paths.get("/path/to/doc.pdf")) // or .file(InputStream) or .file(byte[])
.build());
// Reference in a beta message:
BetaRequestDocumentBlock doc = BetaRequestDocumentBlock.builder()
.fileSource(meta.id())
.build();
```
Other methods: `.list()`, `.delete(String fileId)`, `.download(String fileId)`, `.retrieveMetadata(String fileId)`.


@@ -28,13 +28,13 @@ var client = AnthropicOkHttpClient.fromEnv();
```java
import com.anthropic.models.beta.environments.BetaCloudConfigParams;
import com.anthropic.models.beta.environments.BetaUnrestrictedNetwork;
import com.anthropic.models.beta.environments.EnvironmentCreateParams;
import com.anthropic.models.beta.environments.UnrestrictedNetwork;
var environment = client.beta().environments().create(EnvironmentCreateParams.builder()
.name("my-dev-env")
.config(BetaCloudConfigParams.builder()
.networking(UnrestrictedNetwork.builder().build())
.networking(BetaUnrestrictedNetwork.builder().build())
.build())
.build());
System.out.println("Environment ID: " + environment.id()); // env_...
@@ -290,14 +290,14 @@ client.beta().sessions().delete(session.id());
```java
import com.anthropic.models.beta.agents.BetaManagedAgentsMcpToolsetParams;
import com.anthropic.models.beta.agents.BetaManagedAgentsUrlmcpServerParams;
import com.anthropic.models.beta.agents.BetaManagedAgentsUrlMcpServerParams;
// Agent declares MCP server (no auth here — auth goes in a vault)
var agent = client.beta().agents().create(AgentCreateParams.builder()
.name("GitHub Assistant")
.model("claude-opus-4-8")
.addMcpServer(BetaManagedAgentsUrlmcpServerParams.builder()
.type(BetaManagedAgentsUrlmcpServerParams.Type.URL)
.addMcpServer(BetaManagedAgentsUrlMcpServerParams.builder()
.type(BetaManagedAgentsUrlMcpServerParams.Type.URL)
.name("github")
.url("https://api.githubcopilot.com/mcp/")
.build())


@@ -0,0 +1,173 @@
# Claude API — PHP
> **Note:** The PHP SDK is the official Anthropic SDK for PHP. A beta tool runner is available via `$client->beta->messages->toolRunner()`. Structured output helpers are supported via `StructuredOutputModel` classes. Agent SDK is not available. Bedrock, Vertex AI, and Foundry clients are supported.
## Installation
```bash
composer require "anthropic-ai/sdk"
```
## Client Initialization
```php
use Anthropic\Client;
// Using API key from environment variable
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
```
### Amazon Bedrock
```php
use Anthropic\Bedrock\MantleClient;
// Messages-API Bedrock endpoint. Reads AWS credentials from env.
$client = new MantleClient(awsRegion: 'us-east-1');
```
Model IDs on Bedrock take an `anthropic.` prefix — e.g. `model: 'anthropic.claude-opus-4-8'`.
### Google Vertex AI
```php
use Anthropic\Vertex;
// Constructor is private. Parameter is `location`, not `region`.
$client = Vertex\Client::fromEnvironment(
location: 'us-east5',
projectId: 'my-project-id',
);
```
### Anthropic Foundry
```php
use Anthropic\Foundry;
// Constructor is private. baseUrl or resource is required.
$client = Foundry\Client::withCredentials(
apiKey: getenv('ANTHROPIC_FOUNDRY_API_KEY'),
baseUrl: 'https://<resource>.services.ai.azure.com/anthropic/v1',
);
```
---
## Basic Message Request
```php
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
messages: [
['role' => 'user', 'content' => 'What is the capital of France?'],
],
);
// content is an array of polymorphic blocks (TextBlock, ToolUseBlock,
// ThinkingBlock). Accessing ->text on content[0] without checking the block
// type will throw if the first block is not a TextBlock (e.g., when extended
// thinking is enabled and a ThinkingBlock comes first). Always guard:
foreach ($message->content as $block) {
if ($block->type === 'text') {
echo $block->text;
}
}
```
If you only want the first text block:
```php
foreach ($message->content as $block) {
if ($block->type === 'text') {
echo $block->text;
break;
}
}
```
---
## Extended Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think.
```php
use Anthropic\Messages\ThinkingBlock;
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
thinking: ['type' => 'adaptive', 'display' => 'summarized'], // display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
messages: [
['role' => 'user', 'content' => 'Solve: 27 * 453'],
],
);
// ThinkingBlock(s) precede TextBlock in content
foreach ($message->content as $block) {
if ($block instanceof ThinkingBlock) {
echo "Thinking:\n{$block->thinking}\n\n";
// $block->signature is an opaque string — preserve verbatim if
// passing thinking blocks back in multi-turn conversations
} elseif ($block->type === 'text') {
echo "Answer: {$block->text}\n";
}
}
```
> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking (above). `['type' => 'enabled', 'budgetTokens' => N]` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `thinking: ['type' => 'enabled', 'budgetTokens' => N]` (budget must be < `maxTokens`, min 1024).
`$block->type === 'thinking'` also works for the check; `instanceof` narrows for PHPStan.
---
## Prompt Caching
`system:` takes an array of text blocks; set `cacheControl` on the last block. Array-shape syntax (camelCase keys) is idiomatic. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```php
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
system: [
['type' => 'text', 'text' => $longSystemPrompt, 'cacheControl' => ['type' => 'ephemeral']],
],
messages: [['role' => 'user', 'content' => 'Summarize the key points']],
);
```
For 1-hour TTL: `'cacheControl' => ['type' => 'ephemeral', 'ttl' => '1h']`. There's also a top-level `cacheControl:` on `messages->create(...)` that auto-places on the last cacheable block.
Verify hits via `$message->usage->cacheCreationInputTokens` / `$message->usage->cacheReadInputTokens`.
---
## Stop Details
When `stopReason` is `'refusal'`, the response includes structured `stopDetails`:
```php
if ($message->stopReason === 'refusal' && $message->stopDetails !== null) {
echo "Category: " . $message->stopDetails->category . "\n"; // e.g. "cyber", "bio", "reasoning_extraction", "frontier_llm", or null — see docs for the full set
echo "Explanation: " . $message->stopDetails->explanation . "\n";
}
```
**Refusal fallbacks (Claude Fable 5) — opt in by default.** Fallbacks are opt-in: without them a refused request simply stops. New `claude-fable-5` code should include the server-side `fallbacks` parameter (beta header `server-side-fallback-2026-06-01`, fallback model `claude-opus-4-8`, on the beta messages call) by default. The exact PHP binding (and the client-side middleware for providers without server-side support) is not documented here — WebFetch the PHP SDK repo's `examples/` from `shared/live-sources.md`; full semantics in `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Error Type
`APIStatusException` exposes a `->type` property for programmatic error classification:
```php
try {
$client->messages->create(...);
} catch (\Anthropic\Core\Exceptions\APIStatusException $e) {
echo $e->type?->value; // "rate_limit_error", "overloaded_error", etc.
}
```


@@ -0,0 +1,15 @@
# Message Batches — PHP
## Message Batches API
```php
$batch = $client->messages->batches->create(requests: [
['customId' => 'req-1', 'params' => ['model' => 'claude-opus-4-8', 'maxTokens' => 1024, 'messages' => [...]]],
['customId' => 'req-2', 'params' => [...]],
]);
// Poll $client->messages->batches->retrieve($batch->id) until processingStatus === 'ended',
// then iterate $client->messages->batches->results($batch->id).
```
---


@@ -0,0 +1,11 @@
# Files API — PHP
## Files API
```php
$file = $client->beta->files->upload(
file: fopen('upload_me.txt', 'r'),
betas: ['files-api-2025-04-14'],
);
// Reference $file->id as a file content block on ->beta->messages->create().
```


@@ -0,0 +1,27 @@
# Streaming — PHP
## Streaming
> **Requires SDK v0.5.0+.** v0.4.0 and earlier used a single `$params` array; calling with named parameters throws `Unknown named parameter $model`. Upgrade: `composer require "anthropic-ai/sdk:^0.7"`
```php
use Anthropic\Messages\RawContentBlockDeltaEvent;
use Anthropic\Messages\TextDelta;
$stream = $client->messages->createStream(
model: 'claude-opus-4-8',
maxTokens: 64000,
messages: [
['role' => 'user', 'content' => 'Write a haiku'],
],
);
foreach ($stream as $event) {
if ($event instanceof RawContentBlockDeltaEvent && $event->delta instanceof TextDelta) {
echo $event->delta->text;
}
}
```
---


@@ -1,116 +1,6 @@
# Claude API — PHP
# Tool Use — PHP
> **Note:** The PHP SDK is the official Anthropic SDK for PHP. A beta tool runner is available via `$client->beta->messages->toolRunner()`. Structured output helpers are supported via `StructuredOutputModel` classes. Agent SDK is not available. Bedrock, Vertex AI, and Foundry clients are supported.
## Installation
```bash
composer require "anthropic-ai/sdk"
```
## Client Initialization
```php
use Anthropic\Client;
// Using API key from environment variable
$client = new Client(apiKey: getenv("ANTHROPIC_API_KEY"));
```
### Amazon Bedrock
```php
use Anthropic\Bedrock;
// Constructor is private — use the static factory. Reads AWS credentials from env.
$client = Bedrock\Client::fromEnvironment(region: 'us-east-1');
```
### Google Vertex AI
```php
use Anthropic\Vertex;
// Constructor is private. Parameter is `location`, not `region`.
$client = Vertex\Client::fromEnvironment(
location: 'us-east5',
projectId: 'my-project-id',
);
```
### Anthropic Foundry
```php
use Anthropic\Foundry;
// Constructor is private. baseUrl or resource is required.
$client = Foundry\Client::withCredentials(
authToken: getenv('ANTHROPIC_FOUNDRY_AUTH_TOKEN'),
baseUrl: 'https://<resource>.services.ai.azure.com/anthropic',
);
```
---
## Basic Message Request
```php
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
messages: [
['role' => 'user', 'content' => 'What is the capital of France?'],
],
);
// content is an array of polymorphic blocks (TextBlock, ToolUseBlock,
// ThinkingBlock). Accessing ->text on content[0] without checking the block
// type will throw if the first block is not a TextBlock (e.g., when extended
// thinking is enabled and a ThinkingBlock comes first). Always guard:
foreach ($message->content as $block) {
if ($block->type === 'text') {
echo $block->text;
}
}
```
If you only want the first text block:
```php
foreach ($message->content as $block) {
if ($block->type === 'text') {
echo $block->text;
break;
}
}
```
---
## Streaming
> **Requires SDK v0.5.0+.** v0.4.0 and earlier used a single `$params` array; calling with named parameters throws `Unknown named parameter $model`. Upgrade: `composer require "anthropic-ai/sdk:^0.7"`
```php
use Anthropic\Messages\RawContentBlockDeltaEvent;
use Anthropic\Messages\TextDelta;
$stream = $client->messages->createStream(
model: 'claude-opus-4-8',
maxTokens: 64000,
messages: [
['role' => 'user', 'content' => 'Write a haiku'],
],
);
foreach ($stream as $event) {
if ($event instanceof RawContentBlockDeltaEvent && $event->delta instanceof TextDelta) {
echo $event->delta->text;
}
}
```
---
For conceptual overview (tool definitions, tool choice, tips), see [shared/tool-use-concepts.md](../../shared/tool-use-concepts.md).
## Tool Use
@@ -125,7 +15,7 @@ $weatherTool = new BetaRunnableTool(
definition: [
'name' => 'get_weather',
'description' => 'Get the current weather for a location.',
'input_schema' => [
'inputSchema' => [
'type' => 'object',
'properties' => [
'location' => ['type' => 'string', 'description' => 'City and state'],
@@ -156,7 +46,7 @@ foreach ($runner as $message) {
### Manual Loop
Tools are passed as arrays. **The SDK uses camelCase keys** (`inputSchema`, `toolUseID`, `stopReason`) and auto-maps to the API's snake_case on the wire — since v0.5.0. See [shared tool use concepts](../shared/tool-use-concepts.md) for the loop pattern.
Tools are passed as arrays. **The SDK uses camelCase keys** (`inputSchema`, `toolUseID`, `stopReason`) and auto-maps to the API's snake_case on the wire — since v0.5.0. See [shared tool use concepts](../../shared/tool-use-concepts.md) for the loop pattern.
```php
use Anthropic\Messages\ToolUseBlock;
@@ -223,61 +113,6 @@ foreach ($response->content as $block) {
`$block->type === 'tool_use'` also works; `instanceof ToolUseBlock` narrows for PHPStan.
---
## Extended Thinking
**Adaptive thinking is the recommended mode for Claude 4.6+ models.** Claude decides dynamically when and how much to think.
```php
use Anthropic\Messages\ThinkingBlock;
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
thinking: ['type' => 'adaptive'],
messages: [
['role' => 'user', 'content' => 'Solve: 27 * 453'],
],
);
// ThinkingBlock(s) precede TextBlock in content
foreach ($message->content as $block) {
if ($block instanceof ThinkingBlock) {
echo "Thinking:\n{$block->thinking}\n\n";
// $block->signature is an opaque string — preserve verbatim if
// passing thinking blocks back in multi-turn conversations
} elseif ($block->type === 'text') {
echo "Answer: {$block->text}\n";
}
}
```
> **Deprecated:** `['type' => 'enabled', 'budgetTokens' => N]` (fixed-budget extended thinking) still works on Claude 4.6 but is deprecated. Use adaptive thinking above.
`$block->type === 'thinking'` also works for the check; `instanceof` narrows for PHPStan.
---
## Prompt Caching
`system:` takes an array of text blocks; set `cacheControl` on the last block. Array-shape syntax (camelCase keys) is idiomatic. For placement patterns and the silent-invalidator audit checklist, see `shared/prompt-caching.md`.
```php
$message = $client->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
system: [
['type' => 'text', 'text' => $longSystemPrompt, 'cacheControl' => ['type' => 'ephemeral']],
],
messages: [['role' => 'user', 'content' => 'Summarize the key points']],
);
```
For 1-hour TTL: `'cacheControl' => ['type' => 'ephemeral', 'ttl' => '1h']`. There's also a top-level `cacheControl:` on `messages->create(...)` that auto-places on the last cacheable block.
Verify hits via `$message->usage->cacheCreationInputTokens` / `$message->usage->cacheReadInputTokens`.
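As a language-neutral illustration of that check, here is a minimal Python sketch that summarises cache usage from a response's `usage` data. It assumes the wire-format field names `cache_read_input_tokens`, `cache_creation_input_tokens`, and `input_tokens` (the snake_case counterparts of the PHP properties above), and it assumes `input_tokens` counts only the uncached part of the prompt. `cache_summary` is a hypothetical helper, not part of any SDK.

```python
def cache_summary(usage):
    """Summarise prompt-cache usage from a dict shaped like the API's `usage` object.

    Assumption: total prompt tokens = cache reads + cache writes + uncached input.
    """
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = read + created + uncached
    return {
        "total_input_tokens": total,
        "cache_read_tokens": read,
        "cache_write_tokens": created,
        # Fraction of the prompt served from cache; 0.0 when there is no input.
        "hit_ratio": (read / total) if total else 0.0,
    }
```

A `hit_ratio` that stays at zero across repeated identical-prefix requests suggests something is invalidating the prefix; `shared/prompt-caching.md` covers the audit checklist.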
---
## Structured Outputs
@@ -351,7 +186,7 @@ foreach ($message->content as $block) {
---
## Beta Features & Server-Side Tools
## Beta Features & Anthropic-Defined Tools
**`betas:` is NOT a param on `$client->messages->create()`** — it only exists on the beta namespace. Use it for features that need an explicit opt-in header:
@@ -372,31 +207,47 @@ $response = $client->beta->messages->create(
);
```
**Server-side tools** (bash, web_search, text_editor, code_execution) are GA and work on both paths — `Anthropic\Messages\ToolBash20250124` / `WebSearchTool20260209` / `ToolTextEditor20250728` / `CodeExecutionTool20260120` for non-beta, `Anthropic\Beta\Messages\BetaToolBash20250124` / `BetaWebSearchTool20260209` / `BetaToolTextEditor20250728` / `BetaCodeExecutionTool20260120` for beta. No `betas:` header needed for these.
### Task budgets
```php
$response = $client->beta->messages->create(
model: 'claude-opus-4-8',
maxTokens: 16000,
outputConfig: ['taskBudget' => ['type' => 'tokens', 'total' => 64000]],
tools: [...],
messages: [...],
betas: ['task-budgets-2026-03-13'],
);
```
### Cache diagnostics
Pass the previous response's `id` on the next request; print the `diagnostics` object on the response:
```php
$r2 = $client->beta->messages->create(
model: 'claude-opus-4-8', maxTokens: 1024,
diagnostics: ['previousMessageId' => $r1->id],
betas: ['cache-diagnosis-2026-04-07'],
messages: [...],
);
```
**Anthropic-defined tools** (bash, web_search, text_editor, code_execution) are GA and work on both paths. Of these, web_search and code_execution are server-executed; bash and text_editor are client-executed (you handle the `tool_use` locally). Use `Anthropic\Messages\ToolBash20250124` / `WebSearchTool20260209` / `ToolTextEditor20250728` / `CodeExecutionTool20260120` on the non-beta path and `Anthropic\Beta\Messages\BetaToolBash20250124` / `BetaWebSearchTool20260209` / `BetaToolTextEditor20250728` / `BetaCodeExecutionTool20260120` on the beta path. No `betas:` header is needed for these.
### Tool search (non-beta, server-side)
```php
tools: [
['type' => 'tool_search_tool_regex_20251119', 'name' => 'tool_search_tool_regex'],
['name' => 'get_weather', 'description' => '...', 'inputSchema' => [...], 'deferLoading' => true],
// ... other user tools with 'deferLoading' => true
],
```
### Memory tool (non-beta, client-executed)
Declare `['type' => 'memory_20250818', 'name' => 'memory']`. Handle the `tool_use` by reading/writing files under a fixed `/memories` directory. **Validate every model-supplied path**: resolve to its canonical form and verify it remains within the memory directory; reject traversal (`..`, symlinks) — see `shared/tool-use-concepts.md` § Client-Side Tools.
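The path validation described above is the same in any language, so here is a minimal Python sketch of it rather than a PHP handler. Everything named here is an assumption for illustration: `MEMORY_ROOT`, `VIRTUAL_PREFIX`, and `resolve_memory_path` are hypothetical, and the mapping of the model-visible `/memories` prefix onto a local directory is one possible design, not the SDK's.

```python
from pathlib import Path

# Hypothetical local directory that backs the model-visible "/memories" prefix.
MEMORY_ROOT = Path("/tmp/memories-demo")
VIRTUAL_PREFIX = "/memories"


def resolve_memory_path(model_path, root=MEMORY_ROOT):
    """Map a model-supplied path to a local path, rejecting anything outside root."""
    if model_path != VIRTUAL_PREFIX and not model_path.startswith(VIRTUAL_PREFIX + "/"):
        raise ValueError(f"path must start with {VIRTUAL_PREFIX}: {model_path!r}")

    # Strip the virtual prefix and any leading slashes so the join below
    # can never be hijacked by an absolute path.
    relative = model_path[len(VIRTUAL_PREFIX):].lstrip("/")

    root_real = Path(root).resolve()
    # resolve() collapses ".." and follows symlinks, so the containment
    # check runs on the canonical form of the path.
    candidate = (root_real / relative).resolve()

    if candidate != root_real and root_real not in candidate.parents:
        raise ValueError(f"path escapes the memory directory: {model_path!r}")
    return candidate
```

Comparing canonical paths, rather than scanning the string for `..`, is what lets the same check cover symlinks that point outside the directory.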
---
## Stop Details
When `stopReason` is `'refusal'`, the response includes structured `stopDetails`:
```php
if ($message->stopReason === 'refusal' && $message->stopDetails !== null) {
echo "Category: " . $message->stopDetails->category . "\n"; // "cyber" | "bio" | null
echo "Explanation: " . $message->stopDetails->explanation . "\n";
}
```
---
## Error Type
`APIStatusException` exposes a `->type` property for programmatic error classification:
```php
try {
$client->messages->create(...);
} catch (\Anthropic\Core\Exceptions\APIStatusException $e) {
echo $e->type?->value; // "rate_limit_error", "overloaded_error", etc.
}
```


@@ -7,7 +7,7 @@
## Installation
```bash
composer require "anthropic-ai/sdk"
composer require "anthropic-ai/sdk" "guzzlehttp/guzzle:^7"
```
## Client Initialization
@@ -263,7 +263,14 @@ $client->beta->sessions->resources->delete($resource->id, sessionID: $session->i
## List and Download Session Files
> Listing and downloading files an agent wrote during a session is not yet documented for PHP in this skill or in the apps source examples. See `shared/managed-agents-events.md` and the `anthropic-ai/sdk` PHP repository for the file list/download bindings.
```php
$files = $client->beta->files->list(
scopeID: 'sesn_abc123',
betas: ['managed-agents-2026-04-01'],
);
$content = $client->beta->files->download($files->data[0]->id);
file_put_contents('output.txt', $content);
```
---
@@ -293,7 +300,7 @@ $client->beta->sessions->delete($session->id);
```php
use Anthropic\Beta\Agents\BetaManagedAgentsAgentToolset20260401Params;
use Anthropic\Beta\Agents\BetaManagedAgentsMCPToolsetParams;
use Anthropic\Beta\Agents\BetaManagedAgentsUrlmcpServerParams;
use Anthropic\Beta\Agents\BetaManagedAgentsURLMCPServerParams;
use Anthropic\Beta\Sessions\BetaManagedAgentsAgentParams;
// Agent declares MCP server (no auth here — auth goes in a vault)
@@ -301,7 +308,7 @@ $agent = $client->beta->agents->create(
name: 'GitHub Assistant',
model: 'claude-opus-4-8',
mcpServers: [
BetaManagedAgentsUrlmcpServerParams::with(
BetaManagedAgentsURLMCPServerParams::with(
type: 'url',
name: 'github',
url: 'https://api.githubcopilot.com/mcp/',
@@ -395,8 +402,8 @@ $session = $client->beta->sessions->create(
[
'type' => 'github_repository',
'url' => 'https://github.com/org/repo',
'mountPath' => '/workspace/repo',
'authorizationToken' => 'ghp_your_github_token',
'mount_path' => '/workspace/repo',
'authorization_token' => 'ghp_your_github_token',
],
],
);
@@ -409,14 +416,14 @@ $resources = [
[
'type' => 'github_repository',
'url' => 'https://github.com/org/frontend',
'mountPath' => '/workspace/frontend',
'authorizationToken' => 'ghp_your_github_token',
'mount_path' => '/workspace/frontend',
'authorization_token' => 'ghp_your_github_token',
],
[
'type' => 'github_repository',
'url' => 'https://github.com/org/backend',
'mountPath' => '/workspace/backend',
'authorizationToken' => 'ghp_your_github_token',
'mount_path' => '/workspace/backend',
'authorization_token' => 'ghp_your_github_token',
],
];
```


@@ -116,9 +116,9 @@ response = client.messages.create(
)
```
### Mid-conversation system messages (beta, model-gated)
### Mid-conversation system messages (model-gated)
For operator instructions that arrive mid-conversation (mode switches, injected state), append `{"role": "system", ...}` to `messages` instead of editing top-level `system` — this preserves the cached prefix and carries operator authority. Must follow a user message; cannot be `messages[0]`. Unsupported models return a 400 (`role 'system' is not supported on this model`). See `shared/prompt-caching.md` for when to use this vs. top-level `system`.
For operator instructions that arrive mid-conversation (mode switches, injected state), append `{"role": "system", ...}` to `messages` instead of editing top-level `system` — this preserves the cached prefix and carries operator authority. Must follow a user message (or an `assistant` message ending in server-tool use), and must be either the last entry in `messages` or be followed by an `assistant` turn; cannot be `messages[0]`. Unsupported models return a 400 (`role 'system' is not supported on this model`). See `shared/prompt-caching.md` for when to use this vs. top-level `system`.
```python
response = client.messages.create(
@@ -129,8 +129,7 @@ response = client.messages.create(
{"role": "user", "content": user_message},
{"role": "system", "content": "Terse mode enabled — keep responses under 40 words."},
],
extra_headers={"anthropic-beta": "mid-conversation-system-2026-04-07"},
)
) # No beta header needed — use regular client.messages.create
```
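The placement rules above can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions, not an SDK feature: `check_system_placement` is a hypothetical helper, and because the shape of "an `assistant` message ending in server-tool use" is not spelled out here, that test is left to a caller-supplied predicate that defaults to "never". The API remains the authority on what is accepted.

```python
def check_system_placement(messages, ends_in_server_tool_use=lambda message: False):
    """Return a list of placement problems for role="system" entries; empty means OK.

    `ends_in_server_tool_use` is a hypothetical hook: pass a function that
    recognises an assistant message whose last block is a server-tool use.
    """
    problems = []
    for i, message in enumerate(messages):
        if message.get("role") != "system":
            continue
        if i == 0:
            problems.append("messages[0] cannot have role 'system'")
            continue

        previous = messages[i - 1]
        follows_user = previous.get("role") == "user"
        follows_server_tool = (
            previous.get("role") == "assistant" and ends_in_server_tool_use(previous)
        )
        if not (follows_user or follows_server_tool):
            problems.append(
                f"messages[{i}] must follow a user message "
                "(or an assistant message ending in server-tool use)"
            )

        is_last = i == len(messages) - 1
        if not is_last and messages[i + 1].get("role") != "assistant":
            problems.append(
                f"messages[{i}] must be the last entry or be followed by an assistant turn"
            )
    return problems
```

Returning a list of problems instead of raising lets a caller log every violation at once, or fall back to a different strategy when the list is non-empty.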
---
@@ -258,7 +257,7 @@ If `cache_read_input_tokens` is zero across repeated identical-prefix requests,
response = client.messages.create(
model="claude-opus-4-8",
max_tokens=16000,
thinking={"type": "adaptive"},
thinking={"type": "adaptive", "display": "summarized"}, # display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
output_config={"effort": "high"}, # low | medium | high | max
messages=[{"role": "user", "content": "Solve this step by step..."}]
)
@@ -375,7 +374,7 @@ response2 = conversation.send("What's my name?") # Claude remembers "Alice"
- Consecutive same-role messages are allowed — the API combines them into a single turn
- First message must be `user`
- `role: "system"` messages are allowed mid-conversation under the `mid-conversation-system-2026-04-07` beta on supporting models — see § Mid-conversation system messages above
- `role: "system"` messages are allowed mid-conversation on supporting models (no beta header needed) — see § Mid-conversation system messages above
---
@@ -434,10 +433,39 @@ When `stop_reason` is `"refusal"`, the response includes a `stop_details` object
```python
if response.stop_reason == "refusal" and response.stop_details:
print(f"Category: {response.stop_details.category}") # "cyber" | "bio" | None
print(f"Category: {response.stop_details.category}") # e.g. "cyber", "bio", "reasoning_extraction", "frontier_llm", or None — see docs for the full set
print(f"Explanation: {response.stop_details.explanation}")
```
### Refusal Fallbacks (Claude Fable 5) — opt in by default
Fallbacks are **opt-in**: without them a refused request simply stops. Include the server-side `fallbacks` parameter in `claude-fable-5` code by default — on a policy decline the API re-runs the same request on the fallback model inside the same call. A decline before any output isn't billed (a mid-stream decline bills the streamed partial); the rescue bills at the fallback model's own rates, with cache repricing applied automatically.
```python
response = client.beta.messages.create(
model="claude-fable-5",
max_tokens=16000,
betas=["server-side-fallback-2026-06-01"],
fallbacks=[{"model": "claude-opus-4-8"}],
messages=[{"role": "user", "content": "..."}],
)
# Switch points: one fallback block per model that ran and declined this turn
for block in response.content:
if block.type == "fallback":
print(f"{block.from_.model} declined; {block.to.model} continued")
# Served-by signal — covers sticky turns, which carry no fallback block.
# Pair with stop_reason: the fallback model can itself refuse.
fallback_ran = any(
entry.type == "fallback_message" for entry in response.usage.iterations or []
)
if fallback_ran and response.stop_reason != "refusal":
print(f"Served by {response.model}")
```
A `stop_reason: "refusal"` on the final response means the whole chain refused. The header must be exactly `server-side-fallback-2026-06-01`; the parameter is rejected on the Batches API and unavailable on Amazon Bedrock, Vertex AI, and Microsoft Foundry — register the client-side `BetaRefusalFallbackMiddleware` on the client there instead. Full semantics (sticky routing, billing, streaming, echoing fallback turns back): `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Cost Optimization Strategies


@@ -52,7 +52,7 @@ Claude may return text, thinking blocks, or tool use. Handle each appropriately:
with client.messages.stream(
model="claude-opus-4-8",
max_tokens=64000,
thinking={"type": "adaptive"},
thinking={"type": "adaptive", "display": "summarized"}, # display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
messages=[{"role": "user", "content": "Analyze this problem"}]
) as stream:
for event in stream:


@@ -42,55 +42,27 @@ end
---
## Streaming
## Extended Thinking
> **Fable 5, Opus 4.8, Opus 4.7, Opus 4.6, and Sonnet 4.6:** Use adaptive thinking. `budget_tokens` is removed on Fable 5, Opus 4.8, and 4.7 (400 if sent); deprecated on Opus 4.6 and Sonnet 4.6.
> **Older models:** Use `thinking: { type: "enabled", budget_tokens: N }` (must be < `max_tokens`, min 1024).
```ruby
stream = client.messages.stream(
model: :"claude-opus-4-8",
max_tokens: 64000,
messages: [{ role: "user", content: "Write a haiku" }]
)
stream.text.each { |text| print(text) }
```
---
## Tool Use
The Ruby SDK supports tool use via raw JSON schema definitions and also provides a beta tool runner for automatic tool execution.
### Tool Runner (Beta)
```ruby
class GetWeatherInput < Anthropic::BaseModel
required :location, String, doc: "City and state, e.g. San Francisco, CA"
end
class GetWeather < Anthropic::BaseTool
doc "Get the current weather for a location"
input_schema GetWeatherInput
def call(input)
"The weather in #{input.location} is sunny and 72°F."
end
end
client.beta.messages.tool_runner(
message = client.messages.create(
model: :"claude-opus-4-8",
max_tokens: 16000,
tools: [GetWeather.new],
messages: [{ role: "user", content: "What's the weather in San Francisco?" }]
).each_message do |message|
puts message.content
thinking: { type: "adaptive" },
messages: [{ role: "user", content: "Solve: 27 * 453" }]
)
message.content.each do |block|
case block.type
when :thinking then puts "Thinking: #{block.thinking}"
when :text then puts "Response: #{block.text}"
end
end
```
### Manual Loop
See the [shared tool use concepts](../shared/tool-use-concepts.md) for the tool definition format and agentic loop pattern.
---
## Prompt Caching
@@ -120,11 +92,32 @@ When `stop_reason` is `:refusal`, the response includes structured `stop_details
```ruby
if message.stop_reason == :refusal && message.stop_details
puts "Category: #{message.stop_details.category}" # :cyber, :bio, or nil
puts "Category: #{message.stop_details.category}" # e.g. :cyber, :bio, :reasoning_extraction, :frontier_llm, or nil — see docs for the full set
puts "Explanation: #{message.stop_details.explanation}"
end
```
**Refusal fallbacks (Claude Fable 5) — opt in by default.** Fallbacks are opt-in: without them a refused request simply stops. New `claude-fable-5` code should include the server-side `fallbacks` parameter (beta header `server-side-fallback-2026-06-01`, `fallbacks: [{model: "claude-opus-4-8"}]` on the beta messages call) by default. The exact Ruby binding (and the client-side middleware for providers without server-side support) is not documented here — WebFetch the Ruby SDK repo's `examples/` from `shared/live-sources.md`; full semantics in `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Beta Features
`betas:` is only valid on `client.beta.messages.create`, not the non-beta path.
### Task budgets
```ruby
response = client.beta.messages.create(
model: :"claude-opus-4-8",
max_tokens: 16000,
output_config: { task_budget: { type: :tokens, total: 64_000 } },
tools: [...],
messages: [...],
betas: ["task-budgets-2026-03-13"]
)
```
---
## Error Type
@@ -134,7 +127,7 @@ end
```ruby
begin
client.messages.create(...)
rescue Anthropic::APIStatusError => e
rescue Anthropic::Errors::APIStatusError => e
puts e.type # :rate_limit_error, :overloaded_error, etc.
end
```


@@ -0,0 +1,16 @@
# Streaming — Ruby
## Streaming
```ruby
stream = client.messages.stream(
model: :"claude-opus-4-8",
max_tokens: 64000,
messages: [{ role: "user", content: "Write a haiku" }]
)
stream.text.each { |text| print(text) }
```
---


@@ -0,0 +1,41 @@
# Tool Use — Ruby
For conceptual overview (tool definitions, tool choice, tips), see [shared/tool-use-concepts.md](../../shared/tool-use-concepts.md).
## Tool Use
The Ruby SDK supports tool use via raw JSON schema definitions and also provides a beta tool runner for automatic tool execution.
### Tool Runner (Beta)
```ruby
class GetWeatherInput < Anthropic::BaseModel
required :location, String, doc: "City and state, e.g. San Francisco, CA"
end
class GetWeather < Anthropic::BaseTool
doc "Get the current weather for a location"
input_schema GetWeatherInput
def call(input)
"The weather in #{input.location} is sunny and 72°F."
end
end
client.beta.messages.tool_runner(
model: :"claude-opus-4-8",
max_tokens: 16000,
tools: [GetWeather.new],
messages: [{ role: "user", content: "What's the weather in San Francisco?" }]
).each_message do |message|
puts message.content
end
```
### Manual Loop
See the [shared tool use concepts](../../shared/tool-use-concepts.md) for the tool definition format and agentic loop pattern.
---


@@ -229,7 +229,11 @@ client.beta.sessions.resources.delete(resource.id, session_id: session.id)
## List and Download Session Files
> Listing and downloading files an agent wrote during a session is not yet documented for Ruby in this skill or in the apps source examples. See `shared/managed-agents-events.md` and the `anthropic` Ruby gem repository for the file list/download bindings.
```ruby
files = client.beta.files.list(scope_id: "sesn_abc123", betas: ["managed-agents-2026-04-01"])
content = client.beta.files.download(files.data[0].id)
File.binwrite("output.txt", content.read)
```
---


@@ -90,7 +90,7 @@ Both patterns keep the fixed context small and load detail on demand.
| Constraint (from `prompt-caching.md`) | Agent-specific workaround |
| --- | --- |
| Editing the system prompt mid-session invalidates the cache. | Append a `{"role": "system", ...}` message to `messages[]` instead (beta, on supporting models — see `prompt-caching.md` § Mid-conversation system messages). The cached prefix stays intact, and the model treats it as an operator-authority instruction rather than user text. On models that don't support it, fall back to a `<system-reminder>` text block in the user turn. |
| Editing the system prompt mid-session invalidates the cache. | Append a `{"role": "system", ...}` message to `messages[]` instead (no beta header; on supporting models — see `prompt-caching.md` § Mid-conversation system messages). The cached prefix stays intact, and the model treats it as an operator-authority instruction rather than user text. On models that don't support it, fall back to a `<system-reminder>` text block in the user turn. |
| Switching models mid-session invalidates the cache. | Spawn a **subagent** with the cheaper model for the sub-task; keep the main loop on one model. Claude Code's Explore subagents use Haiku this way. |
| Adding/removing tools mid-session invalidates the cache. | Use **tool search** for dynamic discovery — it appends tool schemas rather than swapping them, so the existing prefix is preserved. |


@@ -1,6 +1,6 @@
# Claude Platform on AWS
**Anthropic-operated** access to the Claude Developer Platform through AWS infrastructure — SigV4 authentication, AWS IAM access control, and AWS Marketplace billing. Because Anthropic operates it, **the API surface matches first-party with same-day parity**: Managed Agents, server-side tools, batches, Files, and every feature in this skill work the same way (**except self-hosted sandboxes** — `config:{type:"self_hosted"}` is not available here; use `cloud`). Model IDs are the bare first-party strings (`claude-opus-4-8`, `claude-sonnet-4-6`) — **no provider prefix**.
**Anthropic-operated** access to the Claude Developer Platform through AWS infrastructure — SigV4 authentication, AWS IAM access control, and AWS Marketplace billing. Because Anthropic operates it, **the API surface matches first-party with same-day parity** — for per-feature exceptions, see `shared/platform-availability.md` (the single source of truth; do not rely on an inline exception list here). Model IDs are the bare first-party strings (`claude-opus-4-8`, `claude-sonnet-4-6`) — **no provider prefix**.
> **Not the same as Amazon Bedrock.** Bedrock is partner-operated (AWS runs the service; release schedules vary, feature subset, `anthropic.`-prefixed model IDs). Claude Platform on AWS and Bedrock coexist; pick by whether you need AWS-native IAM/billing with full Anthropic API parity (this page) vs. Bedrock's own ecosystem.


@@ -183,42 +183,65 @@ thinking: budget_tokens=10000, max_tokens=16000
## Typed Exceptions in SDKs
**Always use the SDK's typed exception classes** instead of checking error messages with string matching. Each HTTP error code maps to a specific exception class:
**Always use the SDK's typed exception classes** instead of checking error messages with string matching. Each HTTP status code maps to a specific exception class per SDK.
| HTTP Code | TypeScript Class | Python Class |
| --------- | --------------------------------- | --------------------------------- |
| 400 | `Anthropic.BadRequestError` | `anthropic.BadRequestError` |
| 401 | `Anthropic.AuthenticationError` | `anthropic.AuthenticationError` |
| 403 | `Anthropic.PermissionDeniedError` | `anthropic.PermissionDeniedError` |
| 404 | `Anthropic.NotFoundError` | `anthropic.NotFoundError` |
| 413 | `Anthropic.RequestTooLargeError` | `anthropic.RequestTooLargeError` |
| 429 | `Anthropic.RateLimitError` | `anthropic.RateLimitError` |
| 500+ | `Anthropic.InternalServerError` | `anthropic.InternalServerError` |
| 529 | `Anthropic.OverloadedError` | `anthropic.OverloadedError` |
| Any | `Anthropic.APIError` | `anthropic.APIError` |
### Exception class names by language
```typescript
// ✅ Correct: use typed exceptions
try {
const response = await client.messages.create({...});
} catch (error) {
if (error instanceof Anthropic.RateLimitError) {
// Handle rate limiting
} else if (error instanceof Anthropic.APIError) {
console.error(`API error ${error.status}:`, error.message);
}
}
| HTTP | Python (`anthropic.*`) / TypeScript (`Anthropic.*`) | Ruby (`Anthropic::Errors::*`) | Java (`com.anthropic.errors.*`) | C# | PHP (`Anthropic\Core\Exceptions\*`) |
|---|---|---|---|---|---|
| 400 | `BadRequestError` | `BadRequestError` | `BadRequestException` | `AnthropicBadRequestException` | `BadRequestException` |
| 401 | `AuthenticationError` | `AuthenticationError` | `UnauthorizedException` | `AnthropicUnauthorizedException` | `AuthenticationException` |
| 403 | `PermissionDeniedError` | `PermissionDeniedError` | `PermissionDeniedException` | `AnthropicForbiddenException` | `PermissionDeniedException` |
| 404 | `NotFoundError` | `NotFoundError` | `NotFoundException` | `AnthropicNotFoundException` | `NotFoundException` |
| 422 | `UnprocessableEntityError` | `UnprocessableEntityError` | `UnprocessableEntityException` | `AnthropicUnprocessableEntityException` | `UnprocessableEntityException` |
| 429 | `RateLimitError` | `RateLimitError` | `RateLimitException` | `AnthropicRateLimitException` | `RateLimitException` |
| ≥500 | `InternalServerError` | `InternalServerError` | `InternalServerException` | `Anthropic5xxException` | `InternalServerException` |
| net | `APIConnectionError` | `APIConnectionError` | `AnthropicIoException` | `AnthropicIOException` | `APIConnectionException` |
| base | `APIError` (both); `APIStatusError` (Python only) | `APIStatusError` / `APIError` | `AnthropicServiceException` | `AnthropicApiException` | `APIStatusException` / `APIException` |
// ❌ Wrong: don't check error messages with string matching
try {
const response = await client.messages.create({...});
} catch (error) {
const msg = error instanceof Error ? error.message : String(error);
if (msg.includes("429") || msg.includes("rate_limit")) { ... }
}
The Ruby and PHP classes live in a dedicated errors namespace — write `Anthropic::Errors::RateLimitError` and `Anthropic\Core\Exceptions\RateLimitException` (not bare `Anthropic::RateLimitError`). All 4xx C# exceptions also inherit from `Anthropic4xxException`.
### Catch most-specific first, in a chain
Order `catch`/`except`/`rescue` clauses from the most specific subclass to the base class, with a separate clause for each category you handle differently — retryable (429, ≥500, network) vs. non-retryable (4xx). The SDK defines a distinct class per status for exactly this reason; a single broad catch-all discards that information.
```python
try:
msg = client.messages.create(...)
except anthropic.NotFoundError as e: # 404 — e.g. bad model ID
...
except anthropic.RateLimitError as e: # 429 — back off and retry
...
except anthropic.APIStatusError as e: # any other non-2xx HTTP response
print(e.status_code, e.message)
except anthropic.APIConnectionError as e: # network failure before a response
...
```
All exception classes extend `Anthropic.APIError`, which has a `status` property. Use `instanceof` checks from most specific to least specific (e.g., check `RateLimitError` before `APIError`).
The same chain shape applies in every SDK: TypeScript `instanceof Anthropic.NotFoundError` → `RateLimitError` → `APIConnectionError` → `APIError` (check `APIConnectionError` before `APIError`: in the TypeScript SDK it's a subclass of `APIError`, unlike Python, where it's a sibling); Ruby `rescue Anthropic::Errors::NotFoundError` → `…::RateLimitError` → `…::APIStatusError`; Java `catch (NotFoundException) … catch (RateLimitException) … catch (AnthropicServiceException)`; C# `catch (AnthropicNotFoundException) … catch (AnthropicRateLimitException) … catch (AnthropicApiException)`; PHP `catch (NotFoundException) … catch (RateLimitException) … catch (APIStatusException)`.
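The retryable versus non-retryable split mentioned above (429, ≥500, and network failures on one side; other 4xx on the other) can be written as a small decision function plus a backoff schedule. This is a minimal sketch with no SDK dependency: `is_retryable` and `backoff_delay` are hypothetical helpers, and the base delay, cap, and full-jitter strategy are illustrative choices rather than anything this skill prescribes.

```python
import random


def is_retryable(status_code):
    """Classify a failure by HTTP status.

    Pass None for a network failure that produced no HTTP response.
    """
    if status_code is None:
        return True  # connection error: no response was received
    return status_code == 429 or status_code >= 500


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter exponential backoff.

    Returns a random delay in seconds in [0, min(cap, base * 2**attempt)],
    where `attempt` starts at 0 for the first retry.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

In a typed `except` chain, the `RateLimitError` and connection-error branches would consult `backoff_delay` before retrying, while the 4xx branches surface the error immediately.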
### Go — `errors.As` then branch on status
The Go SDK returns a single `*anthropic.Error` for all non-2xx responses. Unwrap it with `errors.As`, then branch on `StatusCode`:
```go
_, err := client.Messages.New(ctx, params)
if err != nil {
var apierr *anthropic.Error
if errors.As(err, &apierr) {
switch apierr.StatusCode {
case 404:
// bad model ID / resource
case 429:
// back off and retry
default:
// other API error — apierr.StatusCode, apierr.RequestID
}
} else {
// transport-level error (*url.Error wrapping *net.OpError, etc.)
}
}
```
### Error `.type` Field


@@ -42,7 +42,7 @@ All resources are under the `beta` namespace. Python and TypeScript share identi
**Agent shorthand:** `agent` on session create accepts either a bare string (`agent="agent_abc123"` — uses latest version) or the full reference object (`{type: "agent", id: "agent_abc123", version: 123}`).
**Model shorthand:** `model` on agent create accepts either a bare string (`model="claude-opus-4-8"` — uses `standard` speed) or the full config object (`{id: "claude-opus-4-6", speed: "fast"}`). Note: `speed: "fast"` is only supported on Opus 4.6.
**Model shorthand:** `model` on agent create accepts either a bare string (`model="claude-opus-4-8"` — uses `standard` speed) or the full config object (`{id: "claude-opus-4-8", speed: "fast"}`). Note: `speed: "fast"` is supported only on Opus 4.8 and Opus 4.7. Opus 4.7 fast mode is deprecated; after removal, `speed: "fast"` on Opus 4.7 returns an error. Opus 4.8 is the durable fast-capable tier.
---


@@ -2,7 +2,7 @@
Patterns you'll write on the client side when driving a Managed Agent session, grounded in working SDK examples.
Code samples are TypeScript — Python and cURL follow the same shape; see `python/managed-agents/README.md` and `curl/managed-agents.md` for equivalents.
Code samples are TypeScript — other languages follow the same shape; see `{lang}/managed-agents/README.md` (cURL and C#: `curl/managed-agents.md`) for equivalents.
---


@@ -2,143 +2,81 @@
> **Invoked via `/claude-api managed-agents-onboard`?** You're in the right place. Run the interview below — don't summarize it back to the user, ask the questions.
Use this when a user wants to set up a Managed Agent from scratch: **branch on know-vs-explore → configure the template → set up the session → pre-flight viability check → emit working code.** The pre-flight check (§3) is not optional — a setup missing a tool, credential, or data access it needs will fail mid-run, and the gap is usually visible at setup time.
Claude Managed Agents is a hosted agent: Anthropic runs the agent loop and provisions a sandboxed container per session where the agent's tools execute (or your own worker, with a `self_hosted` environment — see `shared/managed-agents-self-hosted-sandboxes.md`). You supply an **agent config** (tools, skills, model, system prompt — reusable, versioned) and an **environment config** (the sandbox — reusable across agents). Each run is a **session**.
> Read `shared/managed-agents-core.md` alongside this — it has full detail for each knob. This doc is the interview script, not the reference.
The flow is four beats — **describe → agent → environment → session** — the same arc as the Console quickstart, and the same philosophy: **value before credentials**. The user goes from idea to a runnable session before any auth ask; each credential is *flagged* at the moment the design makes it relevant (§2) and *collected* once, at session setup (§4), where it binds (`sessions.create()`) and gets exercised (smoke-test). Read `shared/managed-agents-core.md` alongside this — it has full detail for each knob; this doc is the interview script.
---
Claude Managed Agents is a hosted agent: Anthropic runs the agent loop on its orchestration layer and provisions a sandboxed container per session where the agent's tools execute (or, with a `self_hosted` environment, your own worker runs the tools — see `shared/managed-agents-self-hosted-sandboxes.md`). You supply the agent config and the environment config; the harness — event stream, sandbox orchestration, prompt caching, context compaction, and extended thinking — is handled for you.
## 1. Describe the task
**What you supply:**
- **An agent config** — tools, skills, model, system prompt. Reusable and versioned.
- **An environment config** — the sandbox your agent's tools execute in (`cloud`: networking, packages; or `self_hosted`: your own infra). Reusable across agents.
**Open with a one-breath signpost and a single open prompt — don't guess, don't questionnaire.** In your own words:
Each run of the agent is a **session**.
> Managed Agents is hosted — Anthropic runs the agent loop, the sandbox, and the infrastructure; you just define the agent. We'll do this in three moves: the agent, the environment it runs in, then a live test session. So: describe the agent you want — what should it do, and what kicks it off (a person, an event, a schedule)?
---
Let them answer in full before configuring anything.
## 1. Know or explore?
## 2. Configure the agent — propose, don't interrogate
Ask the user:
Their description does the interview's work. Draft the agent config from it and **present it as a proposal with your suggestions inline** — the user reacts to a concrete config instead of answering a question list. At most one batched follow-up for true gaps. Suggest where the description gives you an opening:
> Do you already know the agent you want to build, or would you like to explore some common patterns first?
- **Tools** — enable the full prebuilt toolset by default (`agent_toolset_20260401`: `bash`, `read`, `write`, `edit`, `glob`, `grep`, `web_fetch`, `web_search`). **Suggest MCP servers** for any third-party service the job names (GitHub, Linear, Slack, …) — and flag the credential each one implies as you suggest it ("Linear MCP → you'll need a Linear API token at kickoff"), so §4's auth step is a formality, not a surprise. Collection itself waits for §4. Custom tools only if the user's own app must answer calls (name, description, input schema — their handler code is theirs; don't generate it).
- **Skills** — **suggest** prebuilt `xlsx`/`docx`/`pptx`/`pdf` when the job produces those artifacts; custom by `skill_id` (max 20 total per agent, prebuilt + custom combined).
- **Outcome** — if the description implies checkable "done" criteria (or you can elicit them in the follow-up: not "a good report" but "a CSV with a numeric `price` column per SKU"), **suggest an Outcome kickoff** — the harness grades and iterates against a rubric (`shared/managed-agents-outcomes.md`).
- **On-hand resources** — repos on disk (`github_repository`: URL, optional `mount_path`/`checkout`; token comes in §4), files to seed (Files API upload → `{type: "file", file_id, mount_path}`; read-only), if the job references them.
- **Model** — default `claude-opus-4-8`; `claude-fable-5` for the hardest long-horizon work (`shared/model-migration.md` → Migrating to Claude Fable 5).
### Explore path — show the patterns
> ‼️ **PR creation needs the GitHub MCP server too** — a `github_repository` mount is filesystem-only. Edit in the mount → push branch via `bash` → open the PR via the MCP `create_pull_request` tool.
Four shapes, same runtime code path (`sessions.create()` → `sessions.events.send()` → stream). Only the trigger and sink differ.
Full detail per knob: `shared/managed-agents-tools.md` (toolset, MCP, custom tools, skills), `shared/managed-agents-environments.md` (repos, files).
| Pattern | Trigger | Example |
|---|---|---|
| Event-triggered | Webhook | GitHub PR push → CMA (GitHub tool) → Slack |
| Scheduled | Cron | Daily brief: browser + GitHub + Jira → CMA → Slack |
| Fire-and-forget PR | Human | Slack slash-command → CMA (GitHub tool) → PR passing CI |
| Research + dashboard | Human | Topic → CMA (web search + `frontend-design` skill) → HTML dashboard |
## 3. Environment
Ask which shape fits, then continue with the Know path using it as the reference.
Usually zero or one question:
### Know path — configure template
- **Reuse or create?** Environments are shared across agents — check for an existing one first.
- **Networking** — default unrestricted egress. Switch to `limited` only if the user wants egress control — then set `allow_mcp_servers: true` or list every MCP server domain in `allowed_hosts`, or those tools fail silently.
- **Suggest `self_hosted`** when the signals are there: tools must run on their own infra, secrets can't leave it, or they need binaries/data the cloud container won't have (`shared/managed-agents-self-hosted-sandboxes.md`; not available on Claude Platform on AWS). Otherwise `cloud` — don't raise it unprompted for simple jobs.
Three rounds. Batch the questions in each round; don't ask them one at a time.
## 4. Session — auth, then test run
**Round A — Tools.** Start here; it's the most concrete part. Three types; ask which the user wants (any combination):
**Auth happens here — collect the credentials flagged in §2, now that the config is settled:** a vault (existing or `vaults.create()`) + `vaults.credentials.create()` for each MCP server declared in §2, `environment_variable` credentials for API keys the job uses (substituted at egress; the sandbox sees a placeholder), and the `authorization_token` for each repo mount. Credentials are write-only; MCP credentials match servers by URL and auto-refresh. See `shared/managed-agents-tools.md` → Vaults.
| Type | What it is | How to guide |
|---|---|---|
| **Prebuilt Claude Agent tools** (`agent_toolset_20260401`) | Ready-to-use: `bash`, `read`, `write`, `edit`, `glob`, `grep`, `web_fetch`, `web_search`. Enable all at once, or individually via `enabled: true/false`. | Recommend enabling the full toolset. List the 8 tools so the user knows what they're getting. Full detail: `shared/managed-agents-tools.md` → Agent Toolset. |
| **MCP tools** | Third-party integrations (GitHub, Linear, Asana, etc.) via `mcp_toolset`. Credentials live in a vault, not inline. | Ask which services. For each, walk through MCP server URL + vault credentials. Full detail: `shared/managed-agents-tools.md` → MCP Servers + Vaults. |
| **Custom tools** | The user's own app handles these tool calls — agent fires `agent.custom_tool_use`, the app sends a result message back. | Ask for each tool: name, description, input schema. The app code that handles the event is *their* code — don't generate it. Full detail: `shared/managed-agents-tools.md` → Custom Tools. |
**Silent viability gate — run this yourself before emitting anything; surface only the gaps.** Walk the job clause by clause: every verb maps to an enabled tool or MCP server ("open a PR" → GitHub MCP, not just the mount); every MCP server and repo mount has its credential from the auth step; every external host is reachable under the networking choice; every file/repo/dataset the job references is mounted; "done" is checkable. If something's missing, say so and resolve it — don't emit a config you already know is under-resourced.
**Round B — Skills, files, and repos.** What the agent has on hand when it starts.
**Kickoff — pick one, never both:**
- `user.message` — conversational.
- `user.define_outcome` + rubric — when §2 settled on an Outcome; the harness iterates and grades until the rubric passes.
- **Scheduled shape?** Skip per-session kickoff entirely — create a **deployment** (`deployments.create()` with `schedule` + `initial_events`); each firing creates the session autonomously. See `shared/managed-agents-scheduled-deployments.md`.
*Skills* — two types; both work the same way — Claude auto-uses them when relevant. Max 20 per agent.
- [ ] **Pre-built Agent Skills**: `xlsx`, `docx`, `pptx`, `pdf`. Reference by name.
- [ ] **Custom Skills**: skills uploaded to the user's org via the Skills API. Reference by `skill_id` + optional `version`. If the skill doesn't exist yet, walk the user through `POST /v1/skills` + `POST /v1/skills/{id}/versions` (beta header `skills-2025-10-02`). Full detail: `shared/managed-agents-tools.md` → Skills + Skills API.
Mechanics to bake into the runtime code: session creation blocks until resources mount (bad mounts surface there, before tokens); open the event stream *before* sending the kickoff; break on `session.status_terminated`, or `session.status_idle` with a terminal `stop_reason` — anything except `requires_action` (`shared/managed-agents-client-patterns.md` Pattern 5); usage lands on `span.model_request_end`; artifacts land in `/mnt/session/outputs/` (`files.list({scope_id: session.id, ...})`).
*GitHub repositories* — any repos the agent needs on-disk? For each:
- [ ] Repo URL (`https://github.com/org/repo`)
- [ ] `authorization_token` (PAT or GitHub App token scoped to the repo)
- [ ] Optional `mount_path` (defaults to `/workspace/<repo-name>`) and `checkout` (branch or SHA)
## 5. Integrate — emit the code
Emit as `resources: [{type: "github_repository", url, authorization_token, ...}]`. Full detail: `shared/managed-agents-environments.md` → GitHub Repositories.
Go straight from the last answer to the code — no preamble, no lecture about setup-vs-runtime; the two-block structure shows it. Generate **two clearly-separated blocks**:
> ‼️ **PR creation needs the GitHub MCP server too.** `github_repository` gives filesystem access only — to open PRs, also attach the GitHub MCP server in Round A and credential it via a vault. The workflow is: edit files in the mounted repo → push branch via `bash` → create PR via the MCP `create_pull_request` tool.
**Block 1 — Setup (run once, store the IDs).** Prefer **YAML files + `ant` CLI** — agents and environments are version-controlled definitions users should check in and apply from CI:
*Files* — any local files to seed the session with? For each:
- [ ] Upload via the Files API → persist `file_id`
- [ ] Choose a `mount_path` — absolute, e.g. `/workspace/data.csv` (parents auto-created; files mount read-only)
Emit as `resources: [{type: "file", file_id, mount_path}]`. Max 999 file resources. Agent working directory defaults to `/workspace`. Full detail: `shared/managed-agents-environments.md` → Files API.
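A minimal sketch of assembling the `resources` list before it is passed to `sessions.create()`, assuming the field names given above (`type`, `file_id`, `mount_path`, `url`, `authorization_token`, `checkout`). The helper names and the `MAX_FILE_RESOURCES` constant are hypothetical and not part of any SDK; confirm the exact shapes in `shared/managed-agents-environments.md` before relying on them.

```python
from typing import List, Optional

MAX_FILE_RESOURCES = 999  # limit stated in the text; the constant name is illustrative


def file_resource(file_id: str, mount_path: str) -> dict:
    """Build a file resource; the text requires an absolute mount path."""
    if not mount_path.startswith("/"):
        raise ValueError(f"mount_path must be absolute: {mount_path!r}")
    return {"type": "file", "file_id": file_id, "mount_path": mount_path}


def repo_resource(url: str, authorization_token: str,
                  mount_path: Optional[str] = None,
                  checkout: Optional[str] = None) -> dict:
    """Build a github_repository resource, omitting unset optional fields."""
    resource = {
        "type": "github_repository",
        "url": url,
        "authorization_token": authorization_token,
    }
    if mount_path is not None:
        resource["mount_path"] = mount_path
    if checkout is not None:
        resource["checkout"] = checkout
    return resource


def build_resources(repos: List[dict], files: List[dict]) -> List[dict]:
    """Combine repo and file resources, enforcing the file-count limit."""
    if len(files) > MAX_FILE_RESOURCES:
        raise ValueError(f"too many file resources: {len(files)}")
    return [*repos, *files]
```

Validating the mount path locally is cheaper than discovering a bad mount when session creation blocks on it.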
**Round C — Identity, success criteria, environment:**
- [ ] Name?
- [ ] Job (one or two sentences — becomes the system prompt)?
- [ ] **What does "done" look like?** Push for concrete, checkable success criteria — not "a good report" but "a CSV with a numeric `price` column per SKU." Explicit criteria give the agent a clear target and let you verify the result; vague ones leave it guessing what "done" means. If they're gradeable, plan to wire an **Outcome** in §2 so the harness grades-and-revises against them. See `shared/managed-agents-outcomes.md`.
- [ ] Networking: unrestricted internet from the container, or lock egress to specific hosts? (If locked, MCP server domains must be in `allowed_hosts` or tools silently fail.)
- [ ] Model? (default `claude-opus-4-8`)
---
## 2. Set up the session
Per-run. Points at the agent + environment, attaches credentials, kicks off.
**Vault credentials** (if the agent declared MCP servers, or the job needs an API key for a CLI/SDK/direct API call):
- [ ] Existing vault, or create one? (`client.beta.vaults.create()` + `vaults.credentials.create()`)
Credentials are write-only. MCP credentials are matched to MCP servers by URL and auto-refreshed; `environment_variable` credentials are substituted into outbound requests at egress (the sandbox sees only a placeholder). See `shared/managed-agents-tools.md` → Vaults.
**Kickoff — pick one:**
- [ ] **Conversational:** a first `user.message` to the agent.
- [ ] **Outcome-graded** (recommended when §Round C produced checkable criteria): send a `user.define_outcome` with a rubric *instead of* a `user.message` — the harness iterates and grades against the rubric until satisfied. Don't send both. See `shared/managed-agents-outcomes.md`.
Session creation blocks until all resources mount. Open the event stream before sending the kickoff. Stream is SSE; break on `session.status_terminated`, or on `session.status_idle` with a terminal `stop_reason` — i.e. anything except `requires_action`, which fires transiently while the session waits on a tool confirmation or custom-tool result (see `shared/managed-agents-client-patterns.md` Pattern 5). Usage lands on `span.model_request_end`. Agent-written artifacts end up in `/mnt/session/outputs/` — download via `files.list({scope_id: session.id, betas: ["managed-agents-2026-04-01"]})`.
**Console escape hatch.** In the runtime block you emit, print the session's Console URL right after `sessions.create()` so the user can watch it in the UI while iterating: `print(f"Watch in Console: https://platform.claude.com/workspaces/default/sessions/{session.id}")` (swap `default` for the user's workspace slug if they named one).
---
## 3. Pre-flight viability check — reconcile the job against the resources
**Do this before emitting any code.** A common, avoidable failure is an under-resourced run: the ask is clear, but the agent is missing a tool, a credential, data access, or the context to act. The agent discovers the gap a few turns in, flails, and gives up — burning the budget to produce nothing. The gap is usually visible at setup time. Catch it here, not after the session fails.
Walk the stated job clause by clause. For each action the agent must take, confirm a resource covers it — and name the gap out loud if one doesn't:
| Gap class | Check | If missing |
|---|---|---|
| **Tool / integration** (most catchable upfront — config is statically inspectable) | Every verb in the job maps to an enabled tool or MCP server. "Triage tickets" → a ticketing MCP server; "open a PR" → GitHub MCP server (a `github_repository` mount alone can't open PRs); "search the web" → `web_search` enabled in the toolset. | Add the tool/MCP server in §Round A, or cut the ask from the job. |
| **Credential / access** | Every MCP server has a vault credential attached (§2). Every external host the job touches is reachable — networking `unrestricted`, or the host is in `allowed_hosts`. | Create/attach the vault; widen `allowed_hosts`. These don't fail until runtime — the smoke-test in §4 is how you surface them cheaply. |
| **Data** | Every file, dataset, or repo the job references is mounted as a `resource` (file, `github_repository`, or memory store). | Upload + mount it in §Round B, or tell the agent where to fetch it from. |
| **Prompt quality / criteria** | The job is specific enough to act on, and "done" is checkable (§Round C). | Tighten the job; wire an Outcome. |
State any unmet gaps to the user and resolve them before generating code. Don't emit a config you already know is under-resourced — an agent can't complete a task it lacks the tools, credentials, or data for.
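The tool and credential rows of the table can be sketched as a static check run before any code is emitted. This is an illustration only: `find_gaps` and its argument shapes (plain sets of tool and MCP server names, a dict from job clause to required capability) are assumptions made for the example, not an SDK or CLI feature.

```python
from typing import Dict, List, Set


def find_gaps(job_needs: Dict[str, str],
              enabled_tools: Set[str],
              mcp_servers: Set[str],
              credentialed_servers: Set[str]) -> List[str]:
    """Return human-readable gaps; an empty list means the config covers the job.

    job_needs maps each clause of the job to the tool or MCP server it requires.
    """
    gaps = []
    # Tool / integration: every clause must map to an enabled tool or MCP server.
    for clause, capability in job_needs.items():
        if capability not in enabled_tools and capability not in mcp_servers:
            gaps.append(f"{clause!r}: no tool or MCP server named {capability!r}")
    # Credential / access: every declared MCP server needs a vault credential.
    for server in sorted(mcp_servers):
        if server not in credentialed_servers:
            gaps.append(f"MCP server {server!r}: no vault credential attached")
    return gaps


if __name__ == "__main__":
    gaps = find_gaps(
        job_needs={"open a PR": "github", "search the web": "web_search"},
        enabled_tools={"bash", "web_search"},
        mcp_servers=set(),  # only a github_repository mount, no GitHub MCP server
        credentialed_servers=set(),
    )
    for gap in gaps:
        print(gap)
```

With the example inputs, the only reported gap is the "open a PR" clause, which mirrors the mount-is-not-enough case called out in the table.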
---
## 4. Emit the code
Go straight from the last interview answer to the code — no preamble about the setup-vs-runtime split, no "the critical thing to internalize…", no lecture about `agents.create()` being one-time. The two-block structure below already shows that; don't narrate it. Generate **two clearly-separated blocks**:
**Block 1 — Setup (run once, store the IDs).** Prefer emitting this as **YAML files + `ant` CLI commands** — agents and environments are version-controlled definitions, and the CLI flow is what users should check into their repo and run from CI. Fall back to SDK code only if the user explicitly wants setup in-language or the `ant` CLI is unavailable.
Emit:
1. `<name>.agent.yaml` with everything from §Rounds A–C (flat: `name`, `model`, `system`, `tools`, `mcp_servers`, `skills`)
2. `<name>.environment.yaml` with §Round C networking
3. The apply commands:
```sh
1. `<name>.agent.yaml` (flat: `name`, `model`, `system`, `tools`, `mcp_servers`, `skills`) and `<name>.environment.yaml`
2. ```sh
AGENT_ID=$(ant beta:agents create < <name>.agent.yaml --transform id -r)
ENV_ID=$(ant beta:environments create < <name>.environment.yaml --transform id -r)
# CI sync: ant beta:agents update --agent-id "$AGENT_ID" --version N < <name>.agent.yaml
```
See `shared/anthropic-cli.md` for the full CLI reference. If emitting SDK code instead, label it `# ONE-TIME SETUP — run once, save the IDs to config/.env` and call `environments.create()` → `agents.create()`.
SDK fallback if the user asks — and **required on Claude Platform on AWS**, where auth is SigV4 and the `ant` CLI has no SigV4 mode (use the platform client from `shared/claude-platform-on-aws.md`): label it `# ONE-TIME SETUP — run once, save the IDs` and call `environments.create()` → `agents.create()`.
**Block 2 — Runtime (run on every invocation).** This is SDK code in the detected language (Python/TS/cURL — see SKILL.md → Language Detection). The runtime path needs to react programmatically to events (tool confirmations, custom tool results, reconnect), which is SDK territory — don't emit shell loops here.
1. Load `env_id` + `agent_id` from config/env
2. `sessions.create(agent=AGENT_ID, environment_id=ENV_ID, resources=[...], vault_ids=[...])` — this blocks until resources mount, so a bad file/repo mount surfaces *here*, before any tokens are spent.
3. **Smoke-test first when the job depends on MCP servers, credentials, or reachable hosts.** Credential and MCP-connectivity failures don't surface at `sessions.create()` — only when the agent first tries to use them. Send one cheap probe turn ("Confirm you can reach <service> and list 1–2 items; don't start the task yet"), check it succeeded, *then* send the real kickoff. A few hundred tokens here beats a runaway session that flails on a missing credential and gives up. Skip for agents with no external dependencies.
4. Open stream, `events.send()` the kickoff (a `user.message`, or a `user.define_outcome` if §2 chose the outcome-graded path), loop until `session.status_terminated` or `session.status_idle && stop_reason.type !== 'requires_action'` (see `shared/managed-agents-client-patterns.md` Pattern 5 for the full gate — do not break on bare `session.status_idle`)
> ⚠️ **Deployments are newer than the rest of the MA surface.** Before emitting `ant beta:deployments …` or `client.beta.deployments` / `client.beta.deployment_runs` calls, verify the user's installed CLI/SDK exposes them (`ant beta:deployments --help`; `hasattr(client.beta, "deployments")`). If not, emit raw HTTP against `POST /v1/deployments` with the `managed-agents-2026-04-01` beta header (plus `oauth-2025-04-20` when authenticating with a Bearer token from `ant auth print-credentials`), and leave an upgrade note marking what simplifies to SDK calls.
> ⚠️ **Never emit `agents.create()` and `sessions.create()` in the same unguarded block.** That teaches the user to create a new agent on every run — the #1 anti-pattern. If they need a single script, wrap agent creation in `if not os.getenv("AGENT_ID"):`.
**Scheduled shape? The deployment is setup, not runtime.** Create it in Block 1, after the agent/environment IDs exist (`deployments.create()` with `schedule` + `initial_events`). Block 2 is then **not** a session loop — there is no per-run kickoff to send. Emit instead: a manual-run trigger (`POST /v1/deployments/{id}/run`) so the user can test now rather than wait for the first firing — the manual run doubles as the smoke test — plus a fetch helper (latest `deployment_runs` entry → `session_id` → Console URL + `files.list(scope_id=session_id)` for the artifacts).
Pull exact syntax from `python/managed-agents/README.md`, `typescript/managed-agents/README.md`, or `curl/managed-agents.md`. Don't invent field names.
**Block 2 — Runtime (every invocation; conversational and Outcome shapes).** SDK code in the detected language (Python/TS/cURL — SKILL.md → Language Detection); don't emit shell loops here:
1. Load `agent_id` + `env_id` from config/env
2. `sessions.create(agent=AGENT_ID, environment_id=ENV_ID, resources=[...], vault_ids=[...])`, then print the Console URL so the user can watch live: `https://platform.claude.com/workspaces/default/sessions/{session.id}` (swap `default` for their workspace slug)
3. **Smoke-test when the job depends on MCP servers, credentials, or locked-down hosts** — those failures don't surface at `sessions.create()`, only on first use. One cheap probe turn ("Confirm you can reach <service> and list 1–2 items; don't start the task"), verify, then send the real kickoff. Skip when there are no external dependencies.
4. Open stream → send the §4 kickoff → loop with the terminal gate from §4.
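The terminal gate in step 4 can be sketched as a small predicate. This is a simplified illustration that models events as plain dicts with a `type` key and an optional `stop_reason` dict; the real SDK event objects may differ, so take the authoritative version from `shared/managed-agents-client-patterns.md` Pattern 5. The names `is_terminal` and `consume` are hypothetical.

```python
from typing import Iterable, List


def is_terminal(event: dict) -> bool:
    """True when the stream loop should stop reading events."""
    event_type = event.get("type")
    if event_type == "session.status_terminated":
        return True
    if event_type == "session.status_idle":
        # Idle is terminal unless the session is waiting on the client
        # (a tool confirmation or a custom-tool result).
        # Assumption: a missing stop_reason is treated as terminal here.
        stop_reason = event.get("stop_reason") or {}
        return stop_reason.get("type") != "requires_action"
    return False


def consume(events: Iterable[dict]) -> List[str]:
    """Read events until the terminal gate fires; return the types seen."""
    seen = []
    for event in events:
        seen.append(event["type"])
        if is_terminal(event):
            break
    return seen
```

Keeping the gate in one predicate makes it harder to break on a bare `session.status_idle` by accident.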
> ⚠️ **Never emit `agents.create()` and `sessions.create()` in the same unguarded block** — that teaches creating a new agent per run, the #1 anti-pattern. Single-script requests: wrap creation in `if not os.getenv("AGENT_ID"):`.
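For the single-script case, the guard can be sketched as follows. `create_agent` is a hypothetical stand-in for the user's one-time setup (for example, a function that wraps `agents.create()` and returns the new ID); it is injected as a callable so the sketch runs without the SDK installed.

```python
import os
from typing import Callable, Mapping


def resolve_agent_id(create_agent: Callable[[], str],
                     env: Mapping[str, str] = os.environ) -> str:
    """Reuse AGENT_ID from the environment; create an agent only if it is unset."""
    agent_id = env.get("AGENT_ID")
    if agent_id:
        return agent_id
    # First run only: create the agent. The caller should persist the returned
    # ID (config or .env) so later runs take the branch above.
    return create_agent()
```

A run with `AGENT_ID` set never calls `create_agent`, which is the property the warning above asks for.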
Pull exact syntax from `{lang}/managed-agents/README.md` for your detected language (cURL and C#: use `curl/managed-agents.md` as the wire-level reference). Don't invent field names.

View File

@@ -41,7 +41,7 @@ Managed Agents is in beta. The SDK sets required beta headers automatically:
| See the full endpoint reference | `shared/managed-agents-api-reference.md` |
| **Create an agent** (required first step) | `shared/managed-agents-core.md` (Agents section) + language file |
| Update/version an agent | `shared/managed-agents-core.md` (Agents → Versioning) — update, don't re-create |
| Create a session | `shared/managed-agents-core.md` + `{lang}/managed-agents/README.md` |
| Create a session | `shared/managed-agents-core.md` + `{lang}/managed-agents/README.md` (cURL/C#: `curl/managed-agents.md`) |
| Configure tools and permissions | `shared/managed-agents-tools.md` |
| Set up MCP servers | `shared/managed-agents-tools.md` (MCP Servers section) |
| Stream events / handle tool_use | `shared/managed-agents-events.md` + language file |

View File

@@ -19,7 +19,7 @@ For the latest, authoritative version (with code samples in every supported lang
| Opus 4.7 Migration Checklist | The required vs optional items for 4.7, tagged `[BLOCKS]` / `[TUNE]` |
| Migrating to Opus 4.8 | Migrating to Opus 4.8 (no new breaking changes; mid-session system prompts; behavioral re-tuning) |
| Opus 4.8 Migration Checklist | The required vs optional items for 4.8, tagged `[BLOCKS]` / `[TUNE]` |
| Migrating to Claude Fable 5 | Migrating to Claude Fable 5 or Claude Mythos 5 (always-on protected thinking, new tokenizer, refusal handling, data retention, behavioral shifts + prompting guidance) |
| Migrating to Claude Fable 5 | Migrating to Claude Fable 5 or Claude Mythos 5 (always-on thinking, raw chain of thought never returned, refusal handling, data retention, behavioral shifts + prompting guidance) |
| Claude Fable 5 Migration Checklist | The required vs optional items for Claude Fable 5, tagged `[BLOCKS]` / `[TUNE]` |
| Verify the Migration | After edits — runtime spot-check |
@@ -69,7 +69,7 @@ Not every file that contains the old model ID is a **caller** of the API. Before
| 1 | **Calls the API/SDK** | `client.messages.create(model=…)`, `anthropic.Anthropic()`, request payloads | Swap the model ID **and** apply the breaking-change checklist for the target version (below). |
| 2 | **Defines or serves the model** | Model registries, OpenAPI specs, routing/queue configs, model-policy enums, generated catalogs | The old entry **stays** (the model is still served). Ask whether to (a) add the new model alongside, (b) leave alone, or (c) retire the old model — never blind-replace. **If you can't ask, default to (a): add the new model alongside and flag it** — replacing would de-register a model that's still in production. |
| 3 | **References the ID as an opaque string** | UI fallback constants, capability-gate substring checks, generic test fixtures, label parsers, env defaults | Usually swap the string and verify any parser/regex/substring match handles the new ID — but check the sub-cases below first. |
| 4 | **Suffixed variant ID** | `claude-<model>-<suffix>` like `-fast`, `-1024k`, `-200k`, `[1m]`, dated snapshots | These are deployment/routing identifiers, not the public model ID. **Do not assume a new-model equivalent exists.** Verify in the registry first; if absent, leave the string alone and flag it. |
| 4 | **Suffixed variant ID** | `claude-<model>-<suffix>` like `-fast`, `-1024k`, `-200k`, `[1m]`, dated snapshots | These are deployment/routing identifiers, not the public model ID. **Do not assume a new-model equivalent exists.** Verify in the registry first; if absent, leave the string alone and flag it. **Exception: `-fast` strings (e.g. `claude-opus-4-6-fast`) are handled by the Fast Mode section below**, which rewrites them to Opus 4.8 plus `speed="fast"` and the `fast-mode-2026-02-01` beta rather than leaving them in place. |
**Bucket 3 sub-cases — before swapping a string reference, check:**
@@ -403,7 +403,7 @@ Legacy tool versions are not supported on 4+. **Both the `type` and the `name` f
| Old | New |
| ------------------------------------------------- | ------------------------------------------------------- |
| `text_editor_20250124` + `str_replace_editor` | `text_editor_20250728` + `str_replace_based_edit_tool` |
| `code_execution_*` (earlier versions) | `code_execution_20250825` |
| `code_execution_*` (earlier versions) | `code_execution_20260521` |
| `undo_edit` command | *(no longer supported — delete call sites)* |
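A rough sketch of applying the table to a single tool definition, assuming tools are plain dicts with `type` and `name` keys. `migrate_tool` is a hypothetical helper, and the code-execution target is the post-change value shown in this diff (`code_execution_20260521`). It does not handle `undo_edit` call sites, which still need to be deleted by hand.

```python
CODE_EXECUTION_TARGET = "code_execution_20260521"


def migrate_tool(tool: dict) -> dict:
    """Return a copy of a tool definition with legacy versions updated."""
    migrated = dict(tool)
    tool_type = tool.get("type", "")
    if tool_type == "text_editor_20250124":
        # Both fields must change together; updating only `type` is not enough.
        migrated["type"] = "text_editor_20250728"
        migrated["name"] = "str_replace_based_edit_tool"
    elif tool_type.startswith("code_execution_"):
        migrated["type"] = CODE_EXECUTION_TARGET
    return migrated
```

Tools that match neither row are returned unchanged, so the helper is safe to map over a whole `tools` list.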
```python
@@ -502,7 +502,7 @@ If the code uses the `AnthropicBedrockMantle` client (Python `anthropic[bedrock]
When migrating a Bedrock file, apply the same rename-table row as first-party, then keep/add the `anthropic.` prefix. Do **not** generate a first-party `claude-*` ID for a Bedrock client — it will 400.
**Skip for Bedrock:** the `code_execution_*` tool-version checklist item and the **Task Budgets** section — both are first-party-only features (Bedrock does not support server-side Anthropic tools or the `task-budgets-2026-03-13` beta). Everything else in this guide — `effort`, adaptive/extended thinking, `output_config.format`, `thinking.display`, fine-grained tool streaming, token counting — is available on Bedrock.
**Skip for Bedrock:** the `code_execution_*` tool-version checklist item and the **Task Budgets** section — neither is available on Bedrock (see `shared/platform-availability.md` for the per-feature table). Everything else in this guide — `effort`, adaptive/extended thinking, `output_config.format`, `thinking.display`, fine-grained tool streaming, token counting — is available on Bedrock.
> **Out of scope:** the legacy Amazon Bedrock integration (`InvokeModel` / `Converse` APIs with ARN-versioned IDs like `anthropic.claude-3-5-sonnet-20241022-v2:0`) uses a different request shape and model-ID format. This guide does not cover it; WebFetch the Bedrock page in `shared/live-sources.md` if the user is migrating between the two Bedrock integrations.
@@ -533,7 +533,7 @@ For each file that calls `messages.create()` / equivalent SDK method:
- [ ] **[BLOCKS]** Remove either `temperature` or `top_p` (passing both 400s on Claude 4+)
- [ ] **[BLOCKS]** Update text-editor tool `type` to `text_editor_20250728`
- [ ] **[BLOCKS]** Update text-editor tool `name` to `str_replace_based_edit_tool`; **changing only the `type` and keeping `name: "str_replace_editor"` returns a 400**
- [ ] **[BLOCKS]** Update code-execution tool to `code_execution_20250825`
- [ ] **[BLOCKS]** Update code-execution tool to `code_execution_20260521`
- [ ] **[BLOCKS]** Delete any `undo_edit` command call sites
- [ ] **[TUNE]** Add handling for `stop_reason == "refusal"`
- [ ] **[TUNE]** Add handling for `stop_reason == "model_context_window_exceeded"` (4.5+)
@@ -681,18 +681,22 @@ Beyond resolution, Opus 4.7 also improves on low-level perception (pointing, mea
Requests that involve prohibited or high-risk topics may lead to refusals.
### Fast Mode: not available on Opus 4.7
### Fast Mode: Opus 4.8 / 4.7 only
Opus 4.7 does not have a Fast Mode variant. **Opus 4.6 Fast remains supported**. Only surface this if the caller's code actually uses a Fast Mode model string (e.g. `claude-opus-4-6-fast`); if the word "fast" does not appear in the code, say nothing about Fast Mode.
Fast mode is available on Opus 4.8 and Opus 4.7. Only surface this if the caller's code actually uses fast mode (e.g. `model="claude-opus-4-6-fast"`, or `speed="fast"` on an unsupported model); if the word "fast" does not appear in the code, say nothing about Fast Mode.
When you see `model="claude-opus-4-6-fast"` (or similar), **the migration edit is**:
When you see `model="claude-opus-4-6-fast"` (or any retired `-fast` model string), **the migration edit is** to move the fast-mode traffic onto Opus 4.8, the durable fast-capable tier:
```python
# Opus 4.7 has no Fast Mode — keeping on 4.6 Fast (caller's choice to switch to standard Opus 4.7).
model="claude-opus-4-6-fast",
# Request fast mode on Opus 4.8.
client.beta.messages.create(
model="claude-opus-4-8", max_tokens=4096,
speed="fast", betas=["fast-mode-2026-02-01"],
messages=[...],
)
```
That is: leave the model string **unchanged**, add the comment above it, and tell the user their two options — (a) stay on Opus 4.6 Fast, which remains supported, or (b) move latency-tolerant traffic to standard Opus 4.7 for the intelligence gain. Do **not** rewrite the model string to `claude-opus-4-7` yourself; that silently trades latency for intelligence, which is the caller's decision.
That is: switch the model to Opus 4.8 and request fast mode the supported way, using the beta `client.beta.messages.…` endpoint, the `fast-mode-2026-02-01` beta flag, and `speed="fast"` as a top-level request parameter (per-language form in SKILL.md § Fast Mode). Opus 4.7 also supports fast mode today, but it is itself being sunset (fast mode removed by default around Jul 25, 2026), so target Opus 4.8 as the durable choice rather than landing on a tier that is about to lose fast mode. Do **not** leave the code on a retired `-fast` model string — the failure mode differs by version: `claude-opus-4-6-fast` is already retired and the API **silently falls back** to standard Opus 4.6 (no error — the caller loses fast-mode speed without noticing); `claude-opus-4-7-fast`, once removed, will instead return an **API error** (hard failure — requests break outright rather than degrading). Either way, migrate to Opus 4.8 fast mode now.
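As an illustration, the same edit can be expressed as a rewrite of request keyword arguments. This is a sketch under stated assumptions: `migrate_fast_request` is a hypothetical helper, it treats any model string ending in `-fast` as retired, and the result is intended for `client.beta.messages.create(**kwargs)` as in the example above.

```python
FAST_MODE_BETA = "fast-mode-2026-02-01"
FAST_MODE_MODEL = "claude-opus-4-8"


def migrate_fast_request(kwargs: dict) -> dict:
    """Rewrite a request that uses a `-fast` model string to Opus 4.8 fast mode."""
    migrated = dict(kwargs)
    if not str(kwargs.get("model", "")).endswith("-fast"):
        return migrated  # not a fast-mode model string; leave the request alone
    migrated["model"] = FAST_MODE_MODEL
    migrated["speed"] = "fast"
    # Keep any betas the caller already sends; add the fast-mode beta once.
    betas = list(kwargs.get("betas", []))
    if FAST_MODE_BETA not in betas:
        betas.append(FAST_MODE_BETA)
    migrated["betas"] = betas
    return migrated
```

Requests without a `-fast` model string pass through untouched, which matches the guidance to say nothing about Fast Mode when the code does not use it.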
### Behavioral shifts (prompt-tunable)
@@ -822,7 +826,7 @@ messages=[
]
```
Phrase these as **context, not commands**. State the fact and let Claude act on it; avoid override-style language ("ignore what the user said", "regardless of the user's request", "disregard the previous instruction"). Claude is trained to protect users from instructions that appear to work against them, and that protection applies to the system role too. This is a beta (`anthropic-beta: mid-conversation-system-2026-04-07`) and is available from Opus 4.7 onward, not 4.8-exclusive. For cache-placement details and the older-model `<system-reminder>` fallback, see `shared/prompt-caching.md` and `shared/agent-design.md`.
Phrase these as **context, not commands**. State the fact and let Claude act on it; avoid override-style language ("ignore what the user said", "regardless of the user's request", "disregard the previous instruction"). Claude is trained to protect users from instructions that appear to work against them, and that protection applies to the system role too. No beta header is required; available on Claude Opus 4.8. For cache-placement details and the older-model `<system-reminder>` fallback, see `shared/prompt-caching.md` and `shared/agent-design.md`.
### Capability improvements
@@ -882,7 +886,7 @@ For a caller **already on Opus 4.7**, only the first item is required; everythin
- [ ] **[TUNE]** Writing voice: re-evaluate style prompts added to counter 4.7's directness — 4.8 is warmer and less hedged by default; re-baseline before keeping them
- [ ] **[TUNE]** Code-review harnesses: keep the report-everything-filter-downstream pattern (4.8 follows "only high-severity" / "be conservative" filters literally, which can depress measured recall)
- [ ] **[TUNE]** Thinking-disabled paths: add a final-answer-only instruction if reasoning leaks into the visible response
- [ ] **[TUNE]** Consider mid-session system messages (`role:"system"` in `messages`, beta `mid-conversation-system-2026-04-07`) for context the app learns mid-session, instead of rebuilding the top-level system prompt and invalidating the cache
- [ ] **[TUNE]** Consider mid-session system messages (`role:"system"` in `messages`; no beta header) for context the app learns mid-session, instead of rebuilding the top-level system prompt and invalidating the cache
---
@@ -892,7 +896,7 @@ For a caller **already on Opus 4.7**, only the first item is required; everythin
Claude Fable 5 is Anthropic's most capable widely released model — for the most demanding reasoning and long-horizon agentic work. **Claude Mythos 5** (`claude-mythos-5`) offers the same capabilities, pricing, and API behavior through Project Glasswing (participation is the only way to access it), and succeeds the invitation-only **Claude Mythos Preview** (`claude-mythos-preview`). Everything in this section applies to both models — only the ID differs. Mythos Preview migrators in Project Glasswing target `claude-mythos-5`; everyone else targets `claude-fable-5`. 1M token context window by default (the maximum is also the default), up to 128K output tokens per request.
**Migrate to Claude Fable 5 only when the user explicitly chose it.** It is not the default Opus upgrade path — pricing is above Opus-tier and the new tokenizer changes cost baselines. For "upgrade to the latest model" requests, the target remains `claude-opus-4-8`.
**Migrate to Claude Fable 5 only when the user explicitly chose it.** It is not the default Opus upgrade path — pricing is above Opus-tier. For "upgrade to the latest model" requests, the target remains `claude-opus-4-8`.
### Breaking changes (vs Opus-tier and Mythos Preview)
@@ -920,28 +924,28 @@ Claude Fable 5 is Anthropic's most capable widely released model — for the mos
3. **Interleaved scratchpad is not supported** (Mythos Preview migrators only). Inter-tool reasoning is returned in thinking blocks instead, which adaptive thinking produces automatically between tool calls.
### Protected thinking — always encrypted, model-specific
### Thinking output on Claude Fable 5 and Claude Mythos 5
Claude Fable 5's `protected_thinking` policy protects the **raw chain of thought** — it is never exposed in responses. What you receive are **regular `thinking` blocks**, not encrypted blobs or `redacted_thinking`: `display: "summarized"` returns a readable summary of the reasoning, and with `"omitted"` — the default, same as Opus 4.8/4.7 — responses still include `thinking` blocks but the `thinking` field is an empty string. `display` controls visibility only; thinking happens and is billed the same under every setting. What's stricter on Claude Fable 5 is **replay**: pass thinking blocks back to the API **unchanged** when continuing a conversation on the same model (the standard multi-turn pattern; dropping or editing them breaks the turn).
On Claude Fable 5 and Claude Mythos 5, the raw chain of thought is never returned. What you receive are **regular `thinking` blocks**, not encrypted blobs or `redacted_thinking`: `display: "summarized"` returns a readable summary of the reasoning, and with `"omitted"` — the default, same as Opus 4.8/4.7 — responses still include `thinking` blocks but the `thinking` field is an empty string. `display` controls visibility only; thinking happens and is billed the same under every setting. When continuing a conversation on the same model, pass thinking blocks back to the API **unchanged** (the standard multi-turn pattern; dropping or editing them breaks the turn).
When continuing on the same model, pass each thinking block back **exactly as received — including blocks whose `thinking` text is empty**. The API rejects blocks whose content has been *modified*, not blocks you have read; displaying the summary is fine, editing or reconstructing blocks is not.
Thinking blocks are tied to the model that produced them, but cross-model replay is forgiving: other models **silently ignore** them rather than rejecting the request (early-access builds returned `invalid_request_error`; that was reverted before launch). Ignored blocks still bill input tokens, though — so when switching models for good, e.g. after a classifier refusal, strip `thinking`/`redacted_thinking` blocks from prior assistant turns to avoid paying for dead weight. Two exceptions: fallback-credit retries must echo the refused body **unchanged**, and `fallback` blocks from a mid-output fallback stay where they appeared.
Regular thinking blocks aren't origin-locked; they replay across models fine (the server renders them into the target model's prompt). Claude Fable 5/Claude Mythos 5 thinking is the exception: a thinking block from these models replayed to a different model is **dropped from the prompt** rather than rendered, typically silently (early-access builds hard-rejected with `invalid_request_error`; that broke workflows and was reverted before launch, but the new behavior is still rolling out, so don't build logic that depends on either outcome). The drop happens before the prompt is priced, so a dropped block **lowers `usage.input_tokens`**: you aren't billed for it, and there's nothing to strip for cost. Don't strip *regular* thinking blocks either: removing them can trigger ordering/signature 400s. Two rules for replay bodies stand regardless: fallback-credit retries must echo the refused body **unchanged**, and `fallback` blocks from a mid-output fallback stay where they appeared.
Related: a request that tries to elicit the model's internal reasoning *in the response text* can be refused with `stop_details.category: "reasoning_extraction"` — applications needing reasoning visibility should read the summarized `thinking` blocks instead of prompting for reasoning.
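The replay rule above (read thinking blocks freely, but send them back exactly as received, empty ones included) can be sketched on plain dicts. The helper names and the sample block values, including the `signature` placeholder, are illustrative assumptions rather than real API output.

```python
import copy


def readable_summaries(content):
    """Collect non-empty thinking summaries for display without touching the blocks."""
    return [
        block["thinking"]
        for block in content
        if block.get("type") == "thinking" and block.get("thinking")
    ]


def next_turn(messages, assistant_content, user_text):
    """Append the assistant turn exactly as received, then the new user turn.

    Nothing is filtered or edited: thinking blocks with empty text are kept.
    """
    return [
        *messages,
        {"role": "assistant", "content": copy.deepcopy(assistant_content)},
        {"role": "user", "content": user_text},
    ]


# Placeholder response content for the sketch (display "omitted": empty thinking text).
received = [
    {"type": "thinking", "thinking": "", "signature": "sig-placeholder"},
    {"type": "text", "text": "Here is the plan."},
]
conversation = [{"role": "user", "content": "Draft a rollout plan."}]
conversation = next_turn(conversation, received, "Now shorten it.")
print(readable_summaries(received))
print(conversation[1]["content"] == received)
```

Deep-copying keeps later mutations of `received` from leaking into the stored history; the blocks themselves are never rebuilt from their displayed text.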
### New tokenizer — re-baseline tokens and cost
### Tokenizer — unchanged from Opus 4.8
Claude Fable 5 uses a new tokenizer. The same content tokenizes to **roughly 30% more tokens** than on Opus-tier and older models (varies by content and workload shape). Billing is per token, so an unchanged workload can cost more after migration even before the per-token price difference.
Claude Fable 5 uses the **same tokenizer as Claude Opus 4.8** (the tokenizer introduced with Opus 4.7). Token counts are roughly unchanged when migrating from Opus 4.7/4.8 or from `claude-mythos-preview`; per-token pricing differs.
- Coming **from `claude-mythos-preview`**: token counts are roughly unchanged (same tokenizer family).
- Coming **from Opus/Sonnet/Haiku**: do not reuse token counts, context-window budgets, or `max_tokens` settings measured on the old model.
- Coming **from Opus 4.7/4.8 or `claude-mythos-preview`**: token counts are roughly unchanged. Re-baseline cost and latency on your own workloads for the per-token price difference.
- Coming **from Opus 4.6, Sonnet, Haiku, or older**: the Opus 4.7 tokenizer produces roughly 1× to 1.35× as many tokens for the same content (varies by content and workload shape). Do not reuse token counts, context-window budgets, or `max_tokens` settings measured on the old model; re-baseline with `count_tokens`.
The token counting endpoint returns counts under **both** tokenizers when you pass `model: "claude-fable-5"` — `input_tokens` (new tokenizer, what you're billed) plus `input_tokens_prior_tokenizer` (the same request under the prior-generation tokenizer) — so you can measure the delta on your own prompts before switching.
To measure the difference on your own prompts, call `count_tokens` once with your current model and once with `model: "claude-fable-5"`, and compare the two `input_tokens` values.
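The two-call comparison described above can be sketched as a small helper. It takes the counting function as a parameter so the arithmetic can be run offline; the helper name `token_ratio`, the stand-in counter, and its numbers are made up for illustration and are not measurements.

```python
def token_ratio(count_input_tokens, current_model, target_model, messages):
    """Return target-model input tokens divided by current-model input tokens.

    `count_input_tokens(model, messages)` must return an int. With the Python SDK
    it could be wired up roughly as:
        lambda m, msgs: client.messages.count_tokens(model=m, messages=msgs).input_tokens
    """
    before = count_input_tokens(current_model, messages)
    after = count_input_tokens(target_model, messages)
    if before <= 0:
        raise ValueError("current-model token count must be positive")
    return after / before


# Stand-in counter with invented numbers, used only to exercise the helper.
FAKE_COUNTS = {"old-model-placeholder": 1000, "claude-fable-5": 1350}


def fake_counter(model, messages):
    return FAKE_COUNTS[model]


prompt = [{"role": "user", "content": "Summarize the attached report."}]
ratio = token_ratio(fake_counter, "old-model-placeholder", "claude-fable-5", prompt)
print(f"{ratio:.2f}x")
```

Running this over a representative sample of real prompts, rather than one message, gives a more useful figure for budgeting `max_tokens` and context windows.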
### `refusal` stop reason — handle before reading content
Claude Fable 5 runs safety classifiers on incoming requests, targeting research biology and most cybersecurity content (Claude Fable 5 is not intended for those domains); benign adjacent work — security tooling, life-sciences tasks — can occasionally trigger false positives, which is why the fallback patterns below matter even for legitimate workloads. (Most Claude consumer surfaces ship with built-in Opus 4.8 fallbacks; API callers configure their own.) A declined request returns a **successful HTTP 200** with `stop_reason: "refusal"`, plus a `stop_details` object with the policy category (`"cyber"`, `"bio"`, `"reasoning_extraction"`, or `null` — treat `null` as a permanent valid state). **Branch on `stop_reason`, never on `stop_details`** — `stop_details` is informational and can be `null` even on a refusal, and `explanation` is not guaranteed present. Note that classifier blocks and ordinary model refusals (the model itself declining) both surface as `stop_reason: "refusal"`; `stop_details.category` tells you which class you're handling, and therefore whether retrying on a fallback model is the right response. The classifier can fire **before any output** (empty `content` array; not billed at all — no input or output tokens, no rate-limit consumption) or **mid-stream** after partial output (already-streamed output is billed at normal rates — discard the partial output rather than treating it as complete). Code that reads `response.content[0]` unconditionally will break — check `stop_reason` first:
Claude Fable 5 runs safety classifiers on incoming requests, targeting research biology and most cybersecurity content (Claude Fable 5 is not intended for those domains); benign adjacent work — security tooling, life-sciences tasks — can occasionally trigger false positives, which is why the fallback patterns below matter even for legitimate workloads. (Most Claude consumer surfaces ship with built-in Opus 4.8 fallbacks; API callers configure their own.) A declined request returns a **successful HTTP 200** with `stop_reason: "refusal"`, plus a `stop_details` object with the policy category (values such as `"cyber"`, `"bio"`, `"reasoning_extraction"`, `"frontier_llm"`, or `null` — treat `null` as a permanent valid state; see the refusal category table in the public docs for the full set). **Branch on `stop_reason`, never on `stop_details`** — `stop_details` is informational and can be `null` even on a refusal, and `explanation` is not guaranteed present. Note that classifier blocks and ordinary model refusals (the model itself declining) both surface as `stop_reason: "refusal"`; `stop_details.category` tells you which class you're handling, and therefore whether retrying on a fallback model is the right response. The classifier can fire **before any output** (empty `content` array; not billed at all — no input or output tokens, no rate-limit consumption) or **mid-stream** after partial output (already-streamed output is billed at normal rates — discard the partial output rather than treating it as complete). Code that reads `response.content[0]` unconditionally will break — check `stop_reason` first:
```python
response = client.messages.create(model="claude-fable-5", max_tokens=1024, messages=[...])
@@ -952,6 +956,8 @@ else:
print(response.content[0].text)
```
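The pre-output versus mid-stream distinction described above can be sketched as a classifier over the response body parsed as a plain dict (as with raw HTTP). The function names and return labels are illustrative assumptions; the branch is made on `stop_reason` only, with `stop_details` read defensively because it may be `null`.

```python
def classify_response(response):
    """Return 'ok', 'refused_before_output', or 'refused_mid_output'.

    Branches on stop_reason only; stop_details is never used for control flow.
    """
    if response.get("stop_reason") != "refusal":
        return "ok"
    if not response.get("content"):
        # Empty content: the request was declined before any output.
        return "refused_before_output"
    # Partial output exists: discard it rather than treating it as complete.
    return "refused_mid_output"


def refusal_category(response):
    """Read the informational category, tolerating a null stop_details."""
    details = response.get("stop_details") or {}
    return details.get("category")


declined = {"stop_reason": "refusal", "stop_details": None, "content": []}
partial = {
    "stop_reason": "refusal",
    "stop_details": {"category": "cyber"},
    "content": [{"type": "text", "text": "Step one is"}],
}
finished = {"stop_reason": "end_turn", "content": [{"type": "text", "text": "Done."}]}

for body in (declined, partial, finished):
    print(classify_response(body), refusal_category(body))
```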
**Default to opting in.** Fallbacks are not automatic on the API — a request without them simply stops on a refusal. Migrated and new Claude Fable 5 code should ship with pattern 1 below (pattern 2 on providers without server-side support) from day one, not as a later hardening step: emit the opt-in in the code, tell the user it's there, and remove it only if they decline.
Three ways to retry a refused request on another model, in order of preference:
**1. Server-side `fallbacks` parameter (beta: Claude API and Claude Platform on AWS) — preferred.** One round trip, a plain client, no client-side logic. Name substitute models (the only supported fallback target at launch is `claude-opus-4-8`, expansion expected); on a policy decline the API runs the next model on the same request and returns its answer, with credit-style repricing applied automatically. A `stop_reason: "refusal"` on the final response means the whole chain refused.
@@ -970,23 +976,26 @@ for block in response.content:
if block.type == "fallback":
print(f"{block.from_.model} declined; {block.to.model} continued")
# Served-by signal: covers every fallback-served turn, INCLUDING sticky turns
# (sticky-served turns carry no fallback block — nothing declined this turn)
iterations = getattr(response.usage, "iterations", None) or []
if any(entry.type == "fallback_message" for entry in iterations):
# Served-by signal: a fallback_message in usage.iterations means a fallback model
# ran; pair it with stop_reason to confirm the fallback served the response
# (a fallback model can also refuse). Covers sticky turns too.
fallback_ran = any(
entry.type == "fallback_message" for entry in response.usage.iterations or []
)
if fallback_ran and response.stop_reason != "refusal":
print(f"Served by {response.model}")
```
Key semantics:
- **Header must be exactly `server-side-fallback-2026-06-01`** — other `server-side-fallback-*` values reject the `fallbacks` param with a 400. The current header carries the *earliest* date of the series (`-2026-06-09` and `-2026-06-02` were earlier previews) — do not "correct" it to a newer-looking date. Rejected on the Batches API; not available on Bedrock/Vertex (use pattern 2 there — the SDK middleware). Entries may override `max_tokens` per hop (bounding that attempt's own output independently of the top-level `max_tokens`); `thinking`, `output_config`, and `speed` overrides are rolling out (`speed` additionally requires its beta) — until your requests accept them, include only `model` and `max_tokens` in each entry. Entries must be distinct and must be in the requested model's `allowed_fallback_models` (visible on `/v1/models` under the beta). The request *with an entry's overrides merged in* must be valid as a direct request to that entry's model.
- **Header must be exactly `server-side-fallback-2026-06-01`** — other `server-side-fallback-*` values reject the `fallbacks` param with a 400. The current header carries the *earliest* date of the series (`-2026-06-09` and `-2026-06-02` were earlier previews) — do not "correct" it to a newer-looking date. Rejected on the Batches API; not available on Amazon Bedrock, Vertex AI, or Microsoft Foundry (use pattern 2 there — the SDK middleware). Entries may override `max_tokens` per hop (bounding that attempt's own output independently of the top-level `max_tokens`); `thinking`, `output_config`, and `speed` overrides are rolling out (`speed` additionally requires its beta) — until your requests accept them, include only `model` and `max_tokens` in each entry. Entries must be distinct and must be in the requested model's `allowed_fallback_models` (published on `/v1/models` when the `server-side-fallback-2026-06-01` beta header is set — not yet visible under the `fallback-credit-*` header alone, and not exposed on Amazon Bedrock, Vertex AI, or Microsoft Foundry). The request *with an entry's overrides merged in* must be valid as a direct request to that entry's model.
- **Triggers on policy declines only** — rate limits, overloads, and server errors on the requested model are returned as-is, never falling back.
- **Reading the response:** a `fallback` content block (`{"type": "fallback", "from": {"model": ...}, "to": {"model": ...}}`) marks each switch point in `content`; the served-by signal is a `fallback_message` entry in `usage.iterations` (don't rely on the block — sticky-served turns have none). Top-level `model` names the model that produced the message.
- **Billing:** `usage.iterations` is the per-attempt source of truth; top-level `usage` covers only the attempt that produced the returned message. Declined-before-output attempts are reported but not billed; fallback attempts bill at the fallback model's rates. Each attempt claims the rate limits of the model that ran it — if the fallback model is rate-limited or overloaded, the refusal is returned instead with `stop_details.recommended_model` naming the canonical model ID to retry directly (populated only when the request included `fallbacks` and the attempt couldn't be made) — size fallback-model limits for expected refusal volume.
- **Billing:** `usage.iterations` is the per-attempt source of truth; top-level `usage` covers only the attempt that produced the returned message. Declined-before-output attempts are reported but not billed; fallback attempts bill at the fallback model's rates. Each attempt claims the rate limits of the model that ran it — if the fallback model is rate-limited or overloaded, the fallback attempt is not made and the preceding refusal is returned instead with `stop_details.recommended_model` naming a model to retry directly (the recommendation is a hint, not a guarantee, and is `null` when no recommendation is available) — size fallback-model limits for expected refusal volume.
- **Sticky routing:** once a conversation falls back, later non-streaming requests with `fallbacks` are served directly by the fallback model for ~1 hour (best-effort; org-scoped content-hash record, not message content; not recorded for ZDR orgs). Handle the requested model being tried again at any time.
- **Echoing fallback turns back:** after a mid-output fallback, omit `thinking`, `redacted_thinking`, and `tool_use` blocks — plus any `server_tool_use` block without its matching `server_tool_result`, and any other unrecognized model-internal block type — that appear *before* the final `fallback` block; text blocks, paired server-tool blocks, and everything after the boundary echo normally. The `fallback` block itself is an ignored audit marker (keep or drop). Streaming: the retry happens on the same stream and already-received content is never invalidated — a pre-output block is seamless (`message_start` names the fallback model; the `fallback` block arrives as an ordinary `content_block_start`, first in `content` — there is no special SSE event type; note `message_start` arrives only after the declined attempt, so time-to-first-byte includes it), and a mid-stream block keeps the partial, marks the boundary with the block, and continues — only the partial's `text` blocks are passed to the fallback model as continuation context (other block types stay in `content` but aren't part of it). Sticky routing is **not consulted on streaming requests** in the initial release, so on streams the `fallback` block check is the complete signal; non-streaming mid-output declines omit the declined partial entirely.
**2. SDK client-side middleware — for providers without server-side fallbacks (Bedrock, Vertex).** Register it on the client and every `client.beta.messages` request (streaming included) retries refusals automatically, splicing the fallback model's events onto the open stream in the same wire shape as pattern 1 (a `fallback` content block at each boundary, per-hop `usage.iterations`). It is also a beta surface: the middleware sends the `fallback-credit-2026-06-01` header by default so retries are repriced via credit tokens (override with its `betas` option). `BetaFallbackState` pins follow-up turns to the model that accepted (the client-side analog of sticky routing) — reuse one state object per conversation:
**2. SDK client-side middleware — for providers without server-side fallbacks (Amazon Bedrock, Vertex AI, Microsoft Foundry).** Register it on the client and every `client.beta.messages` request (streaming included) retries refusals automatically, splicing the fallback model's events onto the open stream in the same wire shape as pattern 1 (a `fallback` content block at each boundary, per-hop `usage.iterations`). It is also a beta surface: the middleware sends the `fallback-credit-2026-06-01` header by default so retries are repriced via credit tokens (override with its `betas` option). `BetaFallbackState` pins follow-up turns to the model that accepted (the client-side analog of sticky routing) — reuse one state object per conversation:
```python
from anthropic import Anthropic, BetaFallbackState, BetaRefusalFallbackMiddleware
@@ -1005,7 +1014,7 @@ Create **one state per conversation** — it is the pinning scope; sharing one a
For languages not listed (Java, Ruby, PHP) — or for a full runnable program in any language — each public SDK repo ships a fallbacks example under `examples/` (e.g. `examples/fallbacks.py`, `examples/refusal-fallback/`): WebFetch the repo from `shared/live-sources.md` § SDK Repositories rather than improvising the binding.
**3. Hand-rolled retry + fallback credit (raw HTTP, or SDKs without the middleware).** Detect the refusal via `stop_reason` and re-send the conversation as-is on a model with broader availability such as `claude-opus-4-8` (Claude Fable 5's protected thinking blocks are silently ignored by other models — no stripping required); keep using the fallback model for subsequent turns. **Fallback credit** (beta: Claude API, Bedrock, Vertex) makes those retries cheaper. Prompt caches are per-model, so a plain retry pays cold cache-writes on the new model. With the `fallback-credit-2026-06-01` beta header (send it on both the original request and the retry), a refusal's `stop_details` carries `fallback_credit_token` (opaque; `null` when unavailable) and `fallback_has_prefill_claim`. Echo the token as the top-level `fallback_credit_token` request parameter on the retry (typed in the GA SDKs; on a pre-GA SDK pass it via `extra_body`) and the previously-cached span bills at cache-read rates — the retry costs what it would have if the conversation had been on that model all along. Rules: the retry body must match the refused request **exactly** in every prompt-shaping field (`system`, `messages`, `tools`, `tool_choice`, `thinking` — do **not** strip thinking blocks when redeeming a credit — the server handles them); the retry model must be in the refused model's `allowed_fallback_models`; the token expires in 5 minutes; Batches results carry no tokens. If `fallback_has_prefill_claim` is `true`, append one assistant message echoing the refused response's `content` — the retry model continues from where the refused model stopped (and completed server-tool work isn't re-run). When echoing, strip trailing whitespace from a final `text` block (the prefill validator rejects it; the credit match tolerates that edit), after omitting any unpaired `tool_use` blocks. 
On a 400, fall back to the unchanged body with the token; on a 400 naming `fallback_credit_token`, retry without it (credit forfeited).
**3. Hand-rolled retry + fallback credit (raw HTTP, or SDKs without the middleware).** Detect the refusal via `stop_reason` and re-send the conversation as-is on a model with broader availability such as `claude-opus-4-8` (Claude Fable 5's thinking blocks are silently ignored by other models — no stripping required); keep using the fallback model for subsequent turns. **Fallback credit** (beta: Claude API, Claude Platform on AWS, Amazon Bedrock, Vertex AI, and Microsoft Foundry) makes those retries cheaper. Prompt caches are per-model, so a plain retry pays cold cache-writes on the new model. With the `fallback-credit-2026-06-01` beta header (send it on both the original request and the retry), a refusal's `stop_details` carries `fallback_credit_token` (opaque; `null` when unavailable) and `fallback_has_prefill_claim`. Echo the token as the top-level `fallback_credit_token` request parameter on the retry (typed in the GA SDKs; on a pre-GA SDK pass it via `extra_body`) and the previously-cached span bills at cache-read rates — the retry costs what it would have if the conversation had been on that model all along. Rules: the retry body must match the refused request **exactly** in every prompt-shaping field (`system`, `messages`, `tools`, `tool_choice`, `thinking` — do **not** strip thinking blocks when redeeming a credit — the server handles them); the retry model must be in the refused model's `allowed_fallback_models`; the token expires in 5 minutes; Batches results carry no tokens. If `fallback_has_prefill_claim` is `true`, append one assistant message echoing the refused response's `content` — the retry model continues from where the refused model stopped (and completed server-tool work isn't re-run). When echoing, strip trailing whitespace from a final `text` block (the prefill validator rejects it; the credit match tolerates that edit), after omitting any unpaired `tool_use` blocks. 
On a 400, fall back to the unchanged body with the token; on a 400 naming `fallback_credit_token`, retry without it (credit forfeited).
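The credit-retry rules above can be sketched as a body builder over plain dicts (the raw HTTP shape). The function name is hypothetical and the token value is a placeholder. One simplification is assumed: every `tool_use` block in the refused response is treated as unpaired, since no result has been sent for it yet.

```python
import copy


def build_credit_retry(refused_request, refused_response, fallback_model):
    """Build the retry body for a refused request, redeeming a fallback credit if offered."""
    details = refused_response.get("stop_details") or {}
    retry = copy.deepcopy(refused_request)  # prompt-shaping fields stay identical
    retry["model"] = fallback_model

    token = details.get("fallback_credit_token")
    if token is None:
        return retry  # no credit available: plain retry on the fallback model
    retry["fallback_credit_token"] = token

    if details.get("fallback_has_prefill_claim"):
        # Echo the refused content, omitting tool_use blocks (assumed unpaired).
        echoed = [
            copy.deepcopy(block)
            for block in refused_response.get("content", [])
            if block.get("type") != "tool_use"
        ]
        # Strip trailing whitespace from a final text block, after the omission.
        if echoed and echoed[-1].get("type") == "text":
            echoed[-1]["text"] = echoed[-1]["text"].rstrip()
        if echoed:
            retry["messages"].append({"role": "assistant", "content": echoed})
    return retry


request = {
    "model": "claude-fable-5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Audit this config."}],
}
response = {
    "stop_reason": "refusal",
    "stop_details": {
        "fallback_credit_token": "tok_placeholder",
        "fallback_has_prefill_claim": True,
    },
    "content": [
        {"type": "text", "text": "Partial answer  \n"},
        {"type": "tool_use", "id": "tu_placeholder", "name": "lookup", "input": {}},
    ],
}
retry_body = build_credit_retry(request, response, "claude-opus-4-8")
print(retry_body["messages"][-1])
```

The 400 handling described above (retry with the unchanged body, then without the token) would wrap the call that sends `retry_body`; it is left out here to keep the sketch offline.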
**Migrating code built on the v1 preview.** If the code you're editing carries any of these markers, it targets the discontinued early-access surface — migrate it to the v2 shapes above, and ship the header and parameter changes together (the v1 parameter shape under the v2 header is a 400):
@@ -1096,7 +1105,7 @@ None of these are API-breaking, but they're where migrated workloads feel differ
}
```
For agents that only narrate routine progress, default summaries are typically adequate without this tool.
For agents that only narrate routine progress, the model's default progress narration is typically adequate without this tool.
### Claude Fable 5 Migration Checklist
@@ -1105,9 +1114,10 @@ For agents that only narrate routine progress, default summaries are typically a
- [ ] **[BLOCKS]** Replace assistant prefill with structured outputs or system prompt instructions
- [ ] **[BLOCKS]** Confirm the org meets the 30-day data-retention requirement (ZDR orgs get `400 invalid_request_error` on every request)
- [ ] **[BLOCKS]** Remove all other `thinking` configuration (`{type: "enabled", budget_tokens: N}` returns a 400, same as on Opus 4.7/4.8); control depth with `output_config.effort` instead
- [ ] **[TUNE]** Re-baseline token counts, context budgets, `max_tokens`, and cost — ~30% more tokens vs Opus-tier (roughly unchanged from Mythos Preview); use `count_tokens` with `model: "claude-fable-5"` to measure
- [ ] **[TUNE]** Add `stop_reason == "refusal"` handling before reading `response.content` (pre-output: empty + unbilled; mid-stream: partial output billed — discard); pick a retry strategy — client-side (replay history as-is; other models ignore Fable's thinking blocks), fallback credit (`fallback-credit-2026-06-01`, exact body), or server-side `fallbacks` (`server-side-fallback-2026-06-01`, Claude API and Claude Platform on AWS)
- [ ] **[TUNE]** If you surfaced thinking text to users, plan for protected (encrypted) thinking — pass blocks back unchanged on the same model, never render or cross-model them
- [ ] **[BLOCKS]** If thinking content is surfaced to users or stored in logs: add `thinking: {type: "adaptive", display: "summarized"}` (the default is `"omitted"` — otherwise the rendered text is empty)
- [ ] **[TUNE]** Re-baseline cost and latency on your own workloads — token counts are roughly unchanged from Opus 4.7/4.8 and Mythos Preview (same tokenizer); per-token pricing differs. Coming from Opus 4.6, Sonnet, Haiku, or older, token counts differ — use `count_tokens` with each model to compare
- [ ] **[TUNE]** Add `stop_reason == "refusal"` handling before reading `response.content` (pre-output: empty + unbilled; mid-stream: partial output billed — discard); opt into a fallback by default — server-side `fallbacks` (`server-side-fallback-2026-06-01`, Claude API and Claude Platform on AWS) where available, otherwise the SDK middleware or fallback credit (`fallback-credit-2026-06-01`, exact body); a bare client-side replay (history as-is; other models drop Fable's thinking blocks) is the floor, not the recommendation
- [ ] **[TUNE]** If you surfaced thinking text to users, plan for the thinking output change — the raw chain of thought is never returned; render the `display: "summarized"` summary (per the [BLOCKS] item above); pass blocks back unchanged on the same model; other models drop them from the prompt (unbilled)
- [ ] **[TUNE]** Plan for minutes-long turns: timeouts, streaming, async check-ins, progress UX (see Behavior changes above)
- [ ] **[TUNE]** Run an effort sweep including low/medium for routine workloads; add the no-tidying instruction if higher effort produces unrequested refactors
- [ ] **[TUNE]** A/B with prior-model scaffolding removed — over-prescriptive prompts/skills reduce Claude Fable 5 output quality


@@ -66,7 +66,7 @@ curl https://api.anthropic.com/v1/models/claude-opus-4-8 \
| Claude Haiku 4.5 | `claude-haiku-4-5` | `claude-haiku-4-5-20251001` | 200K | 64K | Active |
### Model Descriptions
- **Claude Fable 5** — Anthropic's most capable widely released model, for the most demanding reasoning and long-horizon agentic work. Same API surface as Opus 4.7/4.8 with one new breaking change: an explicit `thinking: {type: "disabled"}` returns a 400 — omit the `thinking` parameter instead (thinking is always on, returned in protected/encrypted form). New tokenizer (~30% more tokens than Opus-tier for the same content). Safety classifiers may return `stop_reason: "refusal"`. No assistant prefill. Requires 30-day data retention (not available under ZDR). $10/$50 per MTok; 1M context window (default), 128K max output. See `shared/model-migration.md` → Migrating to Claude Fable 5.
- **Claude Fable 5** — Anthropic's most capable widely released model, for the most demanding reasoning and long-horizon agentic work. Same API surface as Opus 4.7/4.8 with one new breaking change: an explicit `thinking: {type: "disabled"}` returns a 400 — omit the `thinking` parameter instead (thinking is always on; the raw chain of thought is never returned — summaries via `display: "summarized"`). Same tokenizer as Opus 4.8 (token counts roughly unchanged vs Opus 4.7/4.8). Safety classifiers may return `stop_reason: "refusal"`. No assistant prefill. Requires 30-day data retention (not available under ZDR). $10/$50 per MTok; 1M context window (default), 128K max output. See `shared/model-migration.md` → Migrating to Claude Fable 5.
- **Claude Mythos 5** — Same capabilities, pricing, limits, and API behavior as Claude Fable 5; only the model ID differs. Available exclusively through Project Glasswing, where it joins (and succeeds) the invitation-only Claude Mythos Preview (`claude-mythos-preview`). Use it only when the org participates in Project Glasswing; otherwise use `claude-fable-5`.
- **Claude Opus 4.8** — The most capable Opus-tier model — highly autonomous, state-of-the-art on long-horizon agentic work, knowledge work, and memory; clearer, warmer writing. Same API surface as Opus 4.7 (adaptive thinking only; sampling parameters and `budget_tokens` removed). 1M context window at standard API pricing (no long-context premium). See `shared/model-migration.md` → Migrating to Opus 4.8 — a 4.7 → 4.8 move is a model-ID swap plus prompt re-tuning, no new breaking changes.
- **Claude Opus 4.7** — Previous-generation Opus. Highly autonomous; strong on long-horizon agentic work, knowledge work, vision, and memory. Adaptive thinking only; sampling parameters and `budget_tokens` removed. 1M context window. See `shared/model-migration.md` → Migrating to Opus 4.7.


@@ -0,0 +1,96 @@
# Platform Availability
Which features work on which provider platform. **This table is the single source of truth in this skill** — per-feature sections elsewhere point here instead of restating availability. When writing code for a third-party platform (Bedrock, Vertex, Foundry) or Claude Platform on AWS, check this table first; a feature not supported there means use the first-party Claude API surface or a different approach.
Columns: **1P** = first-party Claude API, **P-AWS** = Claude Platform on AWS (Anthropic-operated, same-day parity), **Bedrock** = Amazon Bedrock, **Vertex** = Google Cloud Vertex AI, **Foundry** = Microsoft Foundry. ✅ = GA, β = beta, ❌ = not supported.
| Feature | 1P | P-AWS | Bedrock | Vertex | Foundry | Notes |
|---|---|---|---|---|---|---|
| Messages, streaming, tool use | ✅ | ✅ | ✅ | ✅ | ✅ | Core API |
| PDF input | ✅ | ✅ | ✅ | ✅ | β | |
| Structured outputs / strict tool use | ✅ | ✅ | ✅ | ✅ | β | |
| Adaptive thinking / effort | ✅ | ✅ | ✅ | ✅ | β | |
| Extended thinking | ✅ | ✅ | ✅ | ✅ | β | |
| Prompt caching (5m, 1h) | ✅ | ✅ | ✅ | ✅ | β | |
| Automatic prompt caching | ✅ | ✅ | ❌ | ❌ | β | |
| Token counting | ✅ | ✅ | ✅ | ✅ | β | |
| Citations | ✅ | ✅ | ✅ | ✅ | β | |
| Search results content blocks | ✅ | ✅ | ✅ | ✅ | β | |
| Fine-grained tool streaming | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Compaction | β | β | β | β | β | |
| Context editing | β | β | β | β | β | |
| Context windows (1M) | ✅ | ✅ | ✅ | ✅ | β | |
| `inference_geo` (data residency) | ✅ | ✅ | ❌ | ❌ | ❌ | |
| **Server-side tools** | | | | | | |
| &nbsp;&nbsp;Web search | ✅ | ✅ | ❌ | ✅ | β | Vertex: basic `web_search_20250305` only (no `_20260209` dynamic filtering) |
| &nbsp;&nbsp;Web fetch | ✅ | ✅ | ❌ | ❌ | β | |
| &nbsp;&nbsp;Code execution | ✅ | ✅ | ❌ | ❌ | β | |
| &nbsp;&nbsp;Tool search | ✅ | ✅ | ✅ | ✅ | β | Bedrock: InvokeModel API only, not Converse |
| &nbsp;&nbsp;Advisor tool | β | β | ❌ | ❌ | ❌ | |
| **Client-implemented tools** | | | | | | |
| &nbsp;&nbsp;Bash, text editor, memory | ✅ | ✅ | ✅ | ✅ | β | |
| &nbsp;&nbsp;Computer use | β | β | β | β | β | |
| **Agentic / orchestration** | | | | | | |
| &nbsp;&nbsp;Agent Skills (Messages API) | β | β | ❌ | ❌ | β | |
| &nbsp;&nbsp;Programmatic tool calling | ✅ | ✅ | ❌ | ❌ | β | |
| &nbsp;&nbsp;MCP connector | β | β | ❌ | ❌ | β | |
| &nbsp;&nbsp;Managed Agents | β | β | ❌ | ❌ | ❌ | Foundry ❌ inferred (not in Foundry docs either way) |
| &nbsp;&nbsp;Self-hosted sandboxes | β | β | ❌ | ❌ | ❌ | P-AWS: `GET /v1/environments/{id}/work` list endpoint not supported; other work endpoints OK |
| **API endpoints** | | | | | | |
| &nbsp;&nbsp;Message Batches | ✅ | ✅ | ❌ | ❌ | ❌ | |
| &nbsp;&nbsp;Files API | β | β | ❌ | ❌ | β | |
| &nbsp;&nbsp;Models API | ✅ | ✅ | ❌ | ❌ | ❌ | |
| **Other** | | | | | | |
| &nbsp;&nbsp;Mid-conversation system messages | ✅ | ✅ | ❌ | ❌ | ❌ | Claude Opus 4.8 only |
| &nbsp;&nbsp;Fast mode | β | ❌ | ❌ | ❌ | ❌ | Research preview, beta `fast-mode-2026-02-01`, first-party API only |
| &nbsp;&nbsp;Cache diagnostics | β | ❌ | ❌ | ❌ | ❌ | First-party API only |
| &nbsp;&nbsp;Task budgets | β | β | ❌ | ❌ | ❌ | Beta header `task-budgets-2026-03-13`; 3P availability not documented — assume unsupported |
<!--
GROUNDING (reviewer-only; stripped at runtime by processSkillMarkdown).
All paths are under docker_eval/resources/cdp-skill/public-docs/.
Primary source: build-with-claude/overview.mdx <PlatformAvailability> props
(claudeApi→1P, claudePlatformAws→P-AWS, bedrock→Bedrock, vertexAi→Vertex,
azureAi→Foundry; *Beta suffix→β; prop absent→❌). Per-row citations:
Context windows ov:44
Adaptive thinking ov:45
Batch / Message Batches ov:46; bed:360; vtx:381; fdy:507
Citations ov:47
inference_geo ov:48
Effort ov:49
Extended thinking ov:50
PDF input ov:51
Search results ov:52
Structured outputs ov:53
Advisor tool ov:63
Code execution ov:64
Web fetch ov:65
Web search ov:66; agents-and-tools/tool-use/web-search-tool.mdx:41
Bash/text-editor/memory ov:72,75,74
Computer use ov:73
Agent Skills ov:83
Fine-grained streaming ov:84
MCP connector ov:85; agents-and-tools/mcp-connector.mdx:36
Programmatic tool call ov:86
Tool search ov:87; agents-and-tools/tool-use/tool-search-tool.mdx:24-30
Compaction ov:95
Context editing ov:96
Automatic caching ov:97
Prompt caching 5m/1h ov:98,99
Token counting ov:100
Files API ov:108; build-with-claude/files.mdx:17
Managed Agents managed-agents/overview.mdx:11,70-72; bed:360; vtx:381
Self-hosted sandboxes build-with-claude/claude-platform-on-aws.mdx:525,547
Mid-convo system msgs build-with-claude/mid-conversation-system-messages.mdx:15
Fast mode build-with-claude/fast-mode.mdx:23
Cache diagnostics build-with-claude/cache-diagnostics.mdx:15,1379
Task budgets build-with-claude/task-budgets.mdx:15
Models API bed:360; vtx:381; fdy:506
ov = build-with-claude/overview.mdx
bed = build-with-claude/claude-in-amazon-bedrock.mdx
vtx = build-with-claude/claude-on-vertex-ai.mdx
fdy = build-with-claude/claude-in-microsoft-foundry.mdx
-->
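As a rough illustration of consulting this table from code, the sketch below hard-codes a handful of rows copied by hand from the table above. The names `PLATFORMS`, `AVAILABILITY`, `status`, and `is_supported` are hypothetical helpers for this example, not part of any SDK; the table itself remains the source of truth.

```python
# Hypothetical lookup: a few rows copied by hand from the availability table.
# Status values: "ga", "beta", or None (not supported).
PLATFORMS = ("1P", "P-AWS", "Bedrock", "Vertex", "Foundry")

AVAILABILITY = {
    "web_search": ("ga", "ga", None, "ga", "beta"),
    "code_execution": ("ga", "ga", None, None, "beta"),
    "message_batches": ("ga", "ga", None, None, None),
    "agent_skills": ("beta", "beta", None, None, "beta"),
    "fast_mode": ("beta", None, None, None, None),
}


def status(feature: str, platform: str):
    """Return "ga", "beta", or None for a feature on a platform."""
    row = AVAILABILITY[feature]  # KeyError for rows not copied into this sketch
    return row[PLATFORMS.index(platform)]  # ValueError for unknown platforms


def is_supported(feature: str, platform: str, allow_beta: bool = True) -> bool:
    """True if the feature is usable on the platform, optionally excluding beta."""
    current = status(feature, platform)
    return current == "ga" or (allow_beta and current == "beta")
```

A caller targeting Bedrock could check `is_supported("code_execution", "Bedrock")` before emitting a server-side tool and fall back to the first-party API when it returns `False`.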


@@ -64,7 +64,7 @@ Many requests share a large fixed preamble (few-shot examples, retrieved docs, i
### Mid-conversation system messages
**Beta, model-gated.** When an operator instruction arrives mid-conversation — a mode switch, updated context, dynamically injected state — send it as `{"role": "system", "content": "..."}` appended to `messages[]`, rather than editing top-level `system`. Editing top-level `system` changes the prefix ahead of the entire conversation history, so every cached turn is re-processed uncached; a `role: "system"` message sits after the history and leaves the cached prefix intact.
**Claude Opus 4.8 only; no beta header.** When an operator instruction arrives mid-conversation — a mode switch, updated context, dynamically injected state — send it as `{"role": "system", "content": "..."}` appended to `messages[]`, rather than editing top-level `system`. Editing top-level `system` changes the prefix ahead of the entire conversation history, so every cached turn is re-processed uncached; a `role: "system"` message sits after the history and leaves the cached prefix intact.
```json
// Top-level system stays byte-identical; new instruction goes after the cached history
@@ -78,7 +78,7 @@ Many requests share a large fixed preamble (few-shot examples, retrieved docs, i
This is also the prompt-injection-safe replacement for embedding operator instructions as text inside a user turn (the `<system-reminder>` pattern): both have the same caching profile, but `role: "system"` is the non-spoofable operator channel, whereas text inside user/tool content can be forged by anything that writes to user-visible input.
Requires `anthropic-beta: mid-conversation-system-2026-04-07`. Must follow a `role: "user"` message (or an assistant message ending in a server tool result); cannot be `messages[0]` — use top-level `system` for the initial prompt. Content is text-only. Model-gated — unsupported models return a 400 (`BadRequestError`: `role 'system' is not supported on this model`); catch that error and fall back to putting the instruction in a user-turn `<system-reminder>` block.
Available on Claude Opus 4.8; no beta header is required. Must follow a `role: "user"` message (or an `assistant` message ending in server-tool use), and must be either the last entry in `messages` or be followed by an `assistant` turn; cannot be `messages[0]` — use top-level `system` for the initial prompt. Content is text-only. Unsupported models return a 400 (`BadRequestError`: `role 'system' is not supported on this model`); catch that error and fall back to putting the instruction in a user-turn `<system-reminder>` block.
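The placement rules above can be checked locally before sending a request. The following is a minimal sketch, assuming messages are plain dicts; `check_system_placement` is a hypothetical helper, and treating a list of `text` blocks as valid text-only content is an assumption rather than something stated above.

```python
def check_system_placement(messages: list[dict]) -> list[str]:
    """Return rule violations for role:"system" entries (empty list if none)."""
    problems = []
    for i, msg in enumerate(messages):
        if msg.get("role") != "system":
            continue
        if i == 0:
            problems.append("index 0: cannot be messages[0]; use top-level system")
        elif messages[i - 1].get("role") not in ("user", "assistant"):
            # An assistant predecessor is only valid when it ends in server-tool
            # use; this sketch does not inspect that and lets it through.
            problems.append(f"index {i}: must follow a user message")
        if i + 1 < len(messages) and messages[i + 1].get("role") != "assistant":
            problems.append(f"index {i}: must be last or followed by an assistant turn")
        content = msg.get("content")
        # Assumption: a list made only of text blocks counts as text-only.
        text_only = isinstance(content, str) or (
            isinstance(content, list)
            and all(isinstance(b, dict) and b.get("type") == "text" for b in content)
        )
        if not text_only:
            problems.append(f"index {i}: content must be text-only")
    return problems
```

An empty result does not guarantee the request succeeds, since model gating is only enforced server-side; the 400 fallback described above is still needed.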
### Prompts that change from the beginning every time


@@ -212,13 +212,56 @@ For full documentation, use WebFetch:
---
## Skills
## Agent Skills (Messages API)
Skills package task-specific instructions that Claude loads only when relevant. Each skill is a folder containing a `SKILL.md` file. The skill's short description sits in context by default; Claude reads the full file when the current task calls for it. Use skills to keep specialized instructions out of the base system prompt without losing discoverability.
Agent Skills package task-specific instructions and files that Claude loads when relevant (e.g., the Anthropic pre-built `pptx`, `xlsx`, `pdf`, `docx` skills). On the **Messages API**, skills are enabled via the `container` parameter alongside the code-execution tool — this is **not** the Managed Agents surface and does **not** use `client.beta.agents` / `sessions` / `environments`. Availability: see `shared/platform-availability.md`.
For full documentation, use WebFetch:
Required on each request:
- URL: `https://platform.claude.com/docs/en/agents-and-tools/skills`
1. `client.beta.messages.create(...)` with **both** beta flags: `code-execution-2025-08-25` **and** `skills-2025-10-02`.
2. `container={"skills": [{"type": "anthropic", "skill_id": "<id>", "version": "latest"}]}` — the skills list selects which skills are available inside the execution container.
3. `tools=[{"type": "code_execution_20260521", "name": "code_execution"}]` — skills execute via code execution in the container.
```python
response = client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=16000,
    betas=["code-execution-2025-08-25", "skills-2025-10-02"],
    container={"skills": [{"type": "anthropic", "skill_id": "pptx", "version": "latest"}]},
    tools=[{"type": "code_execution_20260521", "name": "code_execution"}],
    messages=[{"role": "user", "content": "Create a 3-slide presentation on X"}],
)
```
Generated files (`.pptx`, `.xlsx`, …) are written inside the container; the response carries a file ID for each. Download by passing that ID to the Files API (`client.beta.files.download(file_id)` / `GET /v1/files/{id}/content` with `anthropic-beta: files-api-2025-04-14`).
List available skills via `GET /v1/skills` (requires `anthropic-beta: skills-2025-10-02`).
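The exact block shape that carries each generated file's ID is not spelled out here, so the sketch below does not assume one. It walks a response that has already been converted to plain dicts and lists, and collects every string stored under a `file_id` key. `collect_file_ids` is a hypothetical helper, and the nested structure in the usage note is made up purely for illustration.

```python
def collect_file_ids(node) -> list[str]:
    """Recursively gather string values stored under any "file_id" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "file_id" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_file_ids(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_file_ids(item))
    return found
```

For example, `collect_file_ids({"content": [{"type": "hypothetical_result", "content": [{"file_id": "file_abc"}]}]})` returns `["file_abc"]`; each ID can then be passed to the Files API download call described above.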
---
## MCP Connector (Beta)
The MCP connector lets Claude call tools hosted on a remote MCP server directly from the Messages API — Anthropic makes the MCP connection server-side. Requires beta flag `mcp-client-2025-11-20` on `client.beta.messages.create(...)`. Availability: see `shared/platform-availability.md`.
**Two parameters are required together:**
- `mcp_servers` — array of server connection definitions: `[{"type": "url", "url": "<server URL>", "name": "<server-name>", "authorization_token": "<optional>"}]`
- `tools` — must include an `mcp_toolset` entry that references the server by name: `[{"type": "mcp_toolset", "mcp_server_name": "<server-name>"}]`
The `mcp_server_name` in the toolset must match a `name` in `mcp_servers`. Omitting the `mcp_toolset` entry is rejected as a validation error — every server in `mcp_servers` must be referenced by exactly one toolset.
```python
client.beta.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    betas=["mcp-client-2025-11-20"],
    mcp_servers=[{"type": "url", "url": "https://example/sse", "name": "example-mcp"}],
    tools=[{"type": "mcp_toolset", "mcp_server_name": "example-mcp"}],
    messages=[...],
)
```
Go uses the typed constant `anthropic.AnthropicBetaMCPClient2025_11_20`; the older `…2025_04_04` constant is deprecated.
Optional toolset fields: `default_config` (defaults for all tools, e.g. `{"enabled": false}` for allowlist mode) and `configs` (per-tool overrides keyed by tool name).
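Because a mismatch between `mcp_servers` and the `mcp_toolset` entries is rejected by the API, it can be worth checking the pairing locally first. This is a minimal sketch under the rules stated above; `check_mcp_pairing` is a hypothetical helper, not an SDK function.

```python
from collections import Counter


def check_mcp_pairing(mcp_servers: list[dict], tools: list[dict]) -> list[str]:
    """Return mismatches between mcp_servers names and mcp_toolset references."""
    server_names = {server["name"] for server in mcp_servers}
    # Count how many toolsets point at each server name.
    refs = Counter(
        tool["mcp_server_name"] for tool in tools if tool.get("type") == "mcp_toolset"
    )
    problems = []
    for name in sorted(server_names):
        if refs[name] != 1:
            problems.append(
                f"server {name!r} referenced by {refs[name]} toolsets, expected 1"
            )
    for name in sorted(set(refs) - server_names):
        problems.append(f"toolset references unknown server {name!r}")
    return problems
```

An allowlist-mode toolset would look like `{"type": "mcp_toolset", "mcp_server_name": "example-mcp", "default_config": {"enabled": False}, "configs": {"search": {"enabled": True}}}`, where `search` is a made-up tool name.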
---
@@ -232,9 +275,9 @@ For full documentation, use WebFetch:
---
## Server-Side Tools: Computer Use
## Client-Side Tools: Computer Use
Computer use lets Claude interact with a desktop environment (screenshots, mouse, keyboard). It can be Anthropic-hosted (server-side, like code execution) or self-hosted (you provide the environment and execute actions client-side).
Computer use lets Claude interact with a desktop environment (screenshots, mouse, keyboard). It is a client-side tool — your application provides the environment and executes the actions Claude requests; Anthropic processes the screenshots and action requests in real time but does not host the environment or retain the data.
For full documentation, use WebFetch:
@@ -244,7 +287,9 @@ For full documentation, use WebFetch:
## Context Editing
Context editing clears stale tool results and thinking blocks from the transcript as a long-running agent accumulates turns. Unlike compaction (which summarizes), context editing prunes — the cleared content is removed, not replaced. Use it when old tool outputs are no longer relevant and you want to keep the transcript lean without losing the conversation structure. Thresholds for what to clear are configurable.
Context editing clears stale tool results and thinking blocks from the transcript as a long-running agent accumulates turns. Unlike compaction (which summarizes), context editing prunes — the cleared content is removed, not replaced. Use it when old tool outputs are no longer relevant and you want to keep the transcript lean without losing the conversation structure.
**Beta.** Use `client.beta.messages.*` with beta `context-management-2025-06-27`. Configure via `context_management.edits` with a strategy type of `clear_tool_uses_20250919` (clear old tool results; optional `clear_tool_inputs: true` also clears the tool_use params) or `clear_thinking_20251015` (clear thinking blocks). These are **not** the compaction types — `compact_20260112` with beta `compact-2026-01-12` is the separate compaction feature.
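A small builder can keep the strategy type strings in one place. The sketch below only uses the fields named above; `context_edits` is a hypothetical helper, and any thresholds or other options are deliberately left out because they are not described here.

```python
def context_edits(
    clear_tool_results: bool = True,
    clear_tool_inputs: bool = False,
    clear_thinking: bool = False,
) -> dict:
    """Build a context_management payload from the two context-editing strategies."""
    edits = []
    if clear_thinking:
        edits.append({"type": "clear_thinking_20251015"})
    if clear_tool_results:
        edit = {"type": "clear_tool_uses_20250919"}
        if clear_tool_inputs:
            # Also clears the tool_use params, not only the results.
            edit["clear_tool_inputs"] = True
        edits.append(edit)
    return {"edits": edits}
```

The result would be sent as the request's `context_management` field on the beta path with `context-management-2025-06-27`; the exact SDK keyword argument is assumed to mirror that field name and should be confirmed in the `{lang}/` file.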
For full documentation, use WebFetch:
@@ -254,7 +299,7 @@ For full documentation, use WebFetch:
## Server-Side Tools: Advisor (Beta)
The advisor tool lets Claude consult a secondary model during a conversation. The advisor runs its own API call with a model you specify and returns its analysis to the primary model. Use it when you want a second opinion, specialized expertise, or cross-model verification without managing the orchestration yourself.
The advisor tool pairs a faster, lower-cost **executor** model (the top-level `model` on the request) with a higher-intelligence **advisor** model (the `model` field inside the tool definition) that provides strategic guidance mid-generation. The executor does most of the token generation; the advisor is consulted for planning. Availability: see `shared/platform-availability.md`.
### Tool Definition
@@ -262,13 +307,18 @@ The advisor tool lets Claude consult a secondary model during a conversation. Th
{
"type": "advisor_20260301",
"name": "advisor",
"model": "claude-sonnet-4-6"
"model": "claude-opus-4-8"
}
```
The `model` parameter is required — it specifies which model the advisor uses for its own inference. Optional fields: `caching`, `max_uses`, `allowed_callers`, `defer_loading`, `strict`.
**The advisor model must be at least as capable as the executor.** An invalid pairing returns `400 invalid_request_error`. Valid pairs:
**Beta header required:** `advisor-tool-2026-03-01`. The SDK sets this automatically when using `client.beta.messages.create()` with advisor tools.
| Executor (request `model`) | Valid advisor (tool `model`) |
|---|---|
| `claude-haiku-4-5` / `claude-sonnet-4-6` / `claude-opus-4-6` / `claude-opus-4-7` | `claude-opus-4-8` or `claude-opus-4-7` |
| `claude-opus-4-8` | `claude-opus-4-8` only |
Call via `client.beta.messages.create(...)` with `betas=["advisor-tool-2026-03-01"]` (or the `anthropic-beta: advisor-tool-2026-03-01` header). In multi-turn conversations, append the full `response.content` — including any `advisor_tool_result` blocks — back to `messages` on the next turn. If you remove the advisor tool from `tools` on a later turn while the history still contains `advisor_tool_result` blocks, the API returns a 400.
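Since an invalid executor/advisor pairing returns a 400, the pairing table above can be encoded and checked before the request is built. This is a sketch only: `VALID_ADVISORS` is copied by hand from that table and `advisor_tool` is a hypothetical helper.

```python
OPUS_ADVISORS = frozenset({"claude-opus-4-8", "claude-opus-4-7"})

# Executor (request model) -> advisor models accepted for it, per the table.
VALID_ADVISORS = {
    "claude-haiku-4-5": OPUS_ADVISORS,
    "claude-sonnet-4-6": OPUS_ADVISORS,
    "claude-opus-4-6": OPUS_ADVISORS,
    "claude-opus-4-7": OPUS_ADVISORS,
    "claude-opus-4-8": frozenset({"claude-opus-4-8"}),
}


def advisor_tool(executor: str, advisor: str) -> dict:
    """Return the advisor tool definition, or raise if the pairing is not listed."""
    allowed = VALID_ADVISORS.get(executor)
    if allowed is None:
        raise ValueError(f"executor {executor!r} is not in the pairing table")
    if advisor not in allowed:
        raise ValueError(f"{advisor!r} is not a valid advisor for {executor!r}")
    return {"type": "advisor_20260301", "name": "advisor", "model": advisor}
```

The returned dict goes into `tools`, while `executor` is what the request's top-level `model` should be set to.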
---
@@ -291,6 +341,53 @@ For full implementation examples, use WebFetch:
---
## Client-Side Tools: Bash and Text Editor
The bash and text editor tools are **Anthropic-defined, schema-less** tools. Declare them by `type` and `name` only — the input schema is built into the model and cannot be modified. **Do not pass an `input_schema`**, and do not define a custom tool that happens to be named `"bash"` — that creates a user-defined tool without the built-in behavior.
Both are **client-executed**: Claude returns a `tool_use` block, your code performs the action locally, and you send back a `tool_result`. The API is stateless; your application maintains the shell session or filesystem between turns.
### Bash tool declaration
```json
{"type": "bash_20250124", "name": "bash"}
```
| Language | Declaration |
|---|---|
| Python / TypeScript / Ruby / cURL | plain object `{"type": "bash_20250124", "name": "bash"}` |
| Go | `anthropic.ToolUnionParam{OfBashTool20250124: &anthropic.ToolBash20250124Param{}}` |
| Java | `.addTool(ToolBash20250124.builder().build())` from `com.anthropic.models.messages` |
| C# | `Tools = [new ToolBash20250124()]` from `Anthropic.Models.Messages` |
| PHP | `tools: [new \Anthropic\Messages\ToolBash20250124()]` |
Claude's `tool_use.input` contains either `{"command": "<string>"}` or `{"restart": true}`. Check for `restart` first (reset the session, return a confirmation string); otherwise run `command` and return combined stdout + stderr.
> **Security — commands are untrusted model output.** Run in an isolated environment (container, VM, or restricted user); apply an **allowlist** of permitted executables and reject shell operators (`&&`, `|`, `;`, `` ` ``, `$()`); set timeouts and resource limits; log every command. A blocklist is not sufficient.
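The input handling and the allowlist rule for the bash tool can be sketched as follows. This is a simplified, stateless illustration: `handle_bash` and its `allowed` parameter are hypothetical, each command runs as a separate process without a shell, and a real deployment would still run it inside an isolated environment and keep a persistent session.

```python
import shlex
import subprocess

# Shell operators to reject outright, per the security note above.
FORBIDDEN = ("&&", "|", ";", "`", "$(")


def handle_bash(tool_input: dict, allowed: set, timeout: float = 10.0):
    """Return (content, is_error) for one bash tool_use input."""
    if tool_input.get("restart"):
        # A real handler would tear down and recreate its shell session here.
        return "Session restarted.", False
    command = tool_input.get("command", "")
    if any(op in command for op in FORBIDDEN):
        return "Rejected: shell operators are not permitted.", True
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        return f"Rejected: {exc}", True
    if not argv or argv[0] not in allowed:
        return "Rejected: executable is not on the allowlist.", True
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # combined stdout + stderr
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout} seconds.", True
    except OSError as exc:
        return f"Failed to start: {exc}", True
    return proc.stdout, proc.returncode != 0
```

The returned pair maps directly onto the `content` and `is_error` fields of the `tool_result` sent back to Claude.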
### Text editor tool declaration
```json
{"type": "text_editor_20250728", "name": "str_replace_based_edit_tool"}
```
Optional field: `max_characters` to cap `view` output. Java exposes a typed `ToolTextEditor20250728` builder (`com.anthropic.models.messages`); other statically-typed SDKs follow the same naming pattern — see the Anthropic-Defined Tools section in `{lang}/claude-api/tool-use.md` for the exact class.
> **Security — `path` is untrusted model output. Confine every file operation to a fixed project root.** Before executing any command, resolve the model-supplied `path` to its canonical form and verify it remains within your project root; reject the request if it escapes (`..`, symlinks, absolute paths outside the root, URL-encoded traversal like `%2e%2e%2f`). Use your language's built-in path utilities (e.g., Python `pathlib.Path.resolve()` then check `.is_relative_to(root)`). Never call `open()` / `writeFile` / `unlink` directly on the raw `path` value.
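The path check described in the note above can be sketched with `pathlib` alone. `resolve_in_root` is a hypothetical helper name; it requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_in_root(root: Path, raw_path: str) -> Path:
    """Resolve a model-supplied path and raise if it escapes the project root."""
    root = root.resolve()
    # Joining with an absolute raw_path discards root, and resolve() collapses
    # ".." and follows symlinks, so every escape route ends up outside root.
    candidate = (root / raw_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"path escapes project root: {raw_path!r}")
    return candidate
```

The helper never URL-decodes its input, so a value such as `%2e%2e%2f` is treated as a literal file name inside the root; decoding before this check would reintroduce the traversal. Every file operation should then use the returned path, never the raw string.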
`tool_use.input.command` is one of:
| `command` | Other inputs | Action |
|---|---|---|
| `view` | `path`, optional `view_range` | Return file contents or directory listing |
| `create` | `path`, `file_text` | Create/overwrite file with `file_text`. Create a backup if the file already exists. |
| `str_replace` | `path`, `old_str`, `new_str` | Replace exactly one occurrence; error if 0 or >1 matches |
| `insert` | `path`, `insert_line`, `insert_text` | Insert `insert_text` after line `insert_line` (0 = beginning of file) |
For both tools, on error return `{"type": "tool_result", "tool_use_id": "…", "content": "<error text>", "is_error": true}` so Claude can recover.
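The `str_replace` and `insert` semantics from the table, together with the error result shape, can be sketched as pure string functions. The function names are hypothetical, file I/O and path confinement are left out, and the newline handling in `apply_insert` is one reasonable choice rather than documented behavior.

```python
def apply_str_replace(text: str, old_str: str, new_str: str) -> str:
    """Replace exactly one occurrence of old_str; raise on 0 or more than 1."""
    count = text.count(old_str)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for old_str, found {count}")
    return text.replace(old_str, new_str, 1)


def apply_insert(text: str, insert_line: int, insert_text: str) -> str:
    """Insert insert_text after line insert_line (0 = beginning of file)."""
    lines = text.splitlines(keepends=True)
    if not 0 <= insert_line <= len(lines):
        raise ValueError(f"insert_line {insert_line} is out of range")
    if insert_line > 0 and not lines[insert_line - 1].endswith("\n"):
        lines[insert_line - 1] += "\n"  # keep the new text on its own line
    new_line = insert_text if insert_text.endswith("\n") else insert_text + "\n"
    lines.insert(insert_line, new_line)
    return "".join(lines)


def error_result(tool_use_id: str, message: str) -> dict:
    """Build the tool_result block that reports a failure back to Claude."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": message,
        "is_error": True,
    }
```

A dispatcher would catch `ValueError` from either function and return `error_result(tool_use_id, str(exc))` so that Claude can adjust `old_str` or `insert_line` and retry.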
---
## Structured Outputs
Structured outputs constrain Claude's responses to follow a specific JSON schema, guaranteeing valid, parseable output. This is not a separate tool — it enhances the Messages API response format and/or tool parameter validation.


@@ -1,11 +1,17 @@
# Claude API — TypeScript
| Feature | Namespace | Key types / call |
|---|---|---|
| User profiles | beta | `client.beta.userProfiles.create(...)` / `.retrieve(id)` / `.list()`. Pass the returned profile id on `client.beta.messages.create`. Requires a beta header — check the SDK's beta-headers reference for the current flag. |
## Installation
```bash
npm install @anthropic-ai/sdk
```
> **Reading local files (ESM):** `__dirname` and `__filename` are **undefined** in ES modules — using either throws `ReferenceError: __dirname is not defined` at runtime. For cwd-relative reads, pass the bare relative path (`fs.readFileSync("./sample.png")`). For script-relative paths, derive the directory from `import.meta.url`: `const here = path.dirname(fileURLToPath(import.meta.url))`. Never write `path.join(__dirname, …)` in an ESM `.ts` file.
## Client Initialization
```typescript
@@ -53,16 +59,13 @@ const response = await client.messages.create({
});
```
### Mid-conversation system messages (beta, model-gated)
### Mid-conversation system messages (model-gated)
For operator instructions that arrive mid-conversation (mode switches, injected state), append `{role: "system", ...}` to `messages` instead of editing top-level `system` — this preserves the cached prefix and carries operator authority. Must follow a user message; cannot be `messages[0]`. Unsupported models return a 400 (`role 'system' is not supported on this model`). See `shared/prompt-caching.md` for when to use this vs. top-level `system`.
For operator instructions that arrive mid-conversation (mode switches, injected state), append `{role: "system", ...}` to `messages` instead of editing top-level `system` — this preserves the cached prefix and carries operator authority. Must follow a user message (or an `assistant` message ending in server-tool use), and must be either the last entry in `messages` or be followed by an `assistant` turn; cannot be `messages[0]`. Unsupported models return a 400 (`role 'system' is not supported on this model`). See `shared/prompt-caching.md` for when to use this vs. top-level `system`.
```typescript
// SDK types for role:"system" in messages are pending — pass the beta header
// directly until the SDK updates, then switch to client.beta.messages.create
// with betas: ["mid-conversation-system-2026-04-07"].
const response = await client.messages.create(
{
// No beta header needed — use regular client.messages.create.
const response = await client.messages.create({
model: MODEL_ID, // must support mid-conversation system messages
max_tokens: 16000,
system: [
@@ -71,12 +74,9 @@ const response = await client.messages.create(
messages: [
...history,
{ role: "user", content: userMessage },
// @ts-expect-error — role:"system" pending SDK types
{ role: "system", content: "Terse mode enabled — keep responses under 40 words." },
],
},
{ headers: { "anthropic-beta": "mid-conversation-system-2026-04-07" } },
);
});
```
---
@@ -204,7 +204,7 @@ If `cache_read_input_tokens` is zero across repeated identical-prefix requests,
const response = await client.messages.create({
model: "claude-opus-4-8",
max_tokens: 16000,
thinking: { type: "adaptive" },
thinking: { type: "adaptive", display: "summarized" }, // display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
output_config: { effort: "high" }, // low | medium | high | max
messages: [
{ role: "user", content: "Solve this math problem step by step..." },
@@ -333,11 +333,43 @@ When `stop_reason` is `"refusal"`, the response includes a `stop_details` object
```typescript
if (response.stop_reason === "refusal" && response.stop_details) {
console.log(`Category: ${response.stop_details.category}`); // "cyber" | "bio" | null
console.log(`Category: ${response.stop_details.category}`); // e.g. "cyber", "bio", "reasoning_extraction", "frontier_llm", or null — see docs for the full set
console.log(`Explanation: ${response.stop_details.explanation}`);
}
```
### Refusal Fallbacks (Claude Fable 5) — opt in by default
Fallbacks are **opt-in**: without them a refused request simply stops. Include the server-side `fallbacks` parameter in `claude-fable-5` code by default — on a policy decline the API re-runs the same request on the fallback model inside the same call. A decline before any output isn't billed (a mid-stream decline bills the streamed partial); the rescue bills at the fallback model's own rates, with cache repricing applied automatically.
```typescript
const response = await client.beta.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  betas: ["server-side-fallback-2026-06-01"],
  fallbacks: [{ model: "claude-opus-4-8" }],
  messages: [{ role: "user", content: "..." }],
});
// Switch points: one fallback block per model that ran and declined this turn
for (const block of response.content) {
  if (block.type === "fallback") {
    console.log(`${block.from.model} declined; ${block.to.model} continued`);
  }
}
// Served-by signal: covers sticky turns, which carry no fallback block.
// Pair with stop_reason: the fallback model can itself refuse.
const fallbackRan = (response.usage.iterations ?? []).some(
  (entry) => entry.type === "fallback_message",
);
if (fallbackRan && response.stop_reason !== "refusal") {
  console.log(`Served by ${response.model}`);
}
```
A `stop_reason: "refusal"` on the final response means the whole chain refused. The header must be exactly `server-side-fallback-2026-06-01`; the parameter is rejected on the Batches API and unavailable on Amazon Bedrock, Vertex AI, and Microsoft Foundry — register the client-side `betaRefusalFallbackMiddleware` on the client there instead. Full semantics (sticky routing, billing, streaming, echoing fallback turns back): `shared/model-migration.md` → Migrating to Claude Fable 5 → `refusal` stop reason.
---
## Cost Optimization Strategies


@@ -29,7 +29,7 @@ for await (const event of stream) {
const stream = client.messages.stream({
model: "claude-opus-4-8",
max_tokens: 64000,
thinking: { type: "adaptive" },
thinking: { type: "adaptive", display: "summarized" }, // display opt-in: default is omitted (empty thinking text) on Fable 5 / Mythos 5 / Opus 4.8 / 4.7
messages: [{ role: "user", content: "Analyze this problem" }],
});


@@ -208,9 +208,9 @@ const response = await client.messages.create({
---
## Server-Side Tools
## Anthropic-Defined Tools
Version-suffixed `type` literals; `name` is fixed per interface. Pass plain object literals — the `ToolUnion` type is satisfied structurally. **The `name`/`type` pair must match the interface**: mixing `str_replace_based_edit_tool` (20250728 name) with `text_editor_20250124` (which expects `str_replace_editor`) is a TS2322.
Version-suffixed `type` literals; `name` is fixed per interface. Web search and code execution are server-executed; bash and text editor are client-executed (you handle the `tool_use` locally — see `shared/tool-use-concepts.md`). Pass plain object literals — the `ToolUnion` type is satisfied structurally. **The `name`/`type` pair must match the interface**: mixing `str_replace_based_edit_tool` (20250728 name) with `text_editor_20250124` (which expects `str_replace_editor`) is a TS2322.
**Don't type-annotate as `Tool[]`**: `Tool` is just the custom-tool variant. Let structural typing infer from the `tools` param, or annotate as `Anthropic.Messages.ToolUnion[]` if you must:
@@ -525,3 +525,24 @@ const response = await client.messages.create({
],
});
```
---
## Agent Skills
Enable an Anthropic-managed skill (e.g., `pptx`) via `container.skills` + the `code_execution` tool on the beta path. Both beta headers are required. Outputs land as files in the response content — download by file ID via the Files API.
```typescript
const response = await client.beta.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 16000,
  container: {
    skills: [{ type: "anthropic", skill_id: "pptx", version: "latest" }],
  },
  tools: [{ type: "code_execution_20260521", name: "code_execution" }],
  betas: ["code-execution-2025-08-25", "skills-2025-10-02"],
  messages: [{ role: "user", content: "Create a 3-slide deck about X." }],
});
// Find the file_id in response.content, then:
// await client.beta.files.download(fileId)
```