Managed Agents — Overview
Managed Agents provisions a container per session as the agent's workspace. The agent loop runs on Anthropic's orchestration layer; the container is where the agent's tools execute — bash commands, file operations, code. You create a persisted Agent config (model, system prompt, tools, MCP servers, skills), then start Sessions that reference it. The session streams events back to you; you send user messages and tool results in.
⚠️ THE MANDATORY FLOW: Agent (once) → Session (every run)
Why agents are separate objects: versioning. An agent is a persisted, versioned config: every update creates a new immutable version, and sessions pin to a version at creation time. This lets you iterate on the agent (tweak the prompt, add a tool) without breaking sessions already running, roll back if a change regresses, and A/B test versions side-by-side. None of that works if you call agents.create() fresh on every run.
Every session references a pre-created /v1/agents object. Create the agent once, store the ID, and reuse it across runs.
| Step | Call | Frequency |
|---|---|---|
| 1 | POST /v1/agents (model, system, tools, mcp_servers, skills live here) | ONCE. Store agent.id and agent.version. |
| 2 | POST /v1/sessions with agent: "agent_abc123" or {type: "agent", id, version} | Every run. String shorthand uses latest version. |
If you're about to write sessions.create() with model, system, or tools on the session body — stop. Those fields live on agents.create(). The session takes a pointer only.
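The rule above can be checked mechanically before a request goes out. The following is a minimal sketch of a client-side guard, not part of any SDK: `check_session_body` and `AGENT_ONLY_FIELDS` are hypothetical names, and the field list is simply the set of agent config fields named in this document.

```python
# Hypothetical client-side guard; not part of the SDK.
# Field names are the agent-level config fields listed in this document.
AGENT_ONLY_FIELDS = frozenset({"model", "system", "tools", "mcp_servers", "skills"})


def check_session_body(body: dict) -> dict:
    """Return body unchanged, or raise if it carries agent-level config."""
    misplaced = sorted(body.keys() & AGENT_ONLY_FIELDS)
    if misplaced:
        raise ValueError(
            f"{', '.join(misplaced)} belong on POST /v1/agents, not on the session body"
        )
    if "agent" not in body:
        raise ValueError("session body must reference a pre-created agent")
    return body
```

Running a check like this in tests or just before sessions.create() surfaces the mistake locally rather than as an API error.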
When generating code, separate setup from runtime. agents.create() belongs in a setup script (or a guarded if agent_id is None: block), not at the top of the hot path. If the user's code calls agents.create() on every invocation, they're accumulating orphaned agents and paying the create latency for nothing. The correct shape is: create once → persist the ID (config file, env var, secrets manager) → every run loads the ID and calls sessions.create().
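The create once → persist → reuse shape might look like the sketch below. It assumes a JSON file as the persistence layer and takes the actual creation step as a callable, so it makes no claims about SDK signatures: `load_or_create_agent` and `create_agent` are hypothetical names, and `create_agent` stands in for the one-time agents.create() call.

```python
import json
from pathlib import Path
from typing import Callable


def load_or_create_agent(path: Path, create_agent: Callable[[], dict]) -> dict:
    """Return the persisted {"id", "version"} record, creating the agent only if absent.

    `create_agent` is a hypothetical stand-in for the one-time agents.create()
    call; it is assumed to return a mapping with "id" and "version" keys.
    """
    if path.exists():
        # Hot path: load the stored pointer, never touch agents.create().
        return json.loads(path.read_text())
    created = create_agent()  # setup step: runs only when no record exists
    record = {"id": created["id"], "version": created["version"]}
    path.write_text(json.dumps(record))
    return record
```

The same pattern works with an env var or a secrets manager in place of the file; the point is that every run after the first only reads the stored ID.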
To change the agent's behavior, use POST /v1/agents/{id} — don't create a new one. Each update bumps the version; running sessions keep their pinned version, new sessions get the latest (or pin explicitly via {type: "agent", id, version}). See shared/managed-agents-core.md → Agents → Versioning.
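The two forms of the session's agent field can be built with a small helper. This is an illustrative sketch: `agent_ref` is a hypothetical name, and treating the version as an integer is an assumption, since this document does not state its type.

```python
from typing import Optional, Union


def agent_ref(agent_id: str, version: Optional[int] = None) -> Union[str, dict]:
    """Build the value for the session's agent field.

    The integer version type is an assumption made for illustration.
    """
    if version is None:
        return agent_id  # string shorthand: new session gets the latest version
    # Explicit pin: the session stays on this version even after later updates.
    return {"type": "agent", "id": agent_id, "version": version}
```

For example, `agent_ref("agent_abc123")` yields the shorthand string, while `agent_ref("agent_abc123", 3)` yields the pinned object form.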
Beta Headers
Managed Agents is in beta. The SDK sets required beta headers automatically:
| Beta Header | What it enables |
|---|---|
| managed-agents-2026-04-01 | Agents, Environments, Sessions, Events, Session Resources, Vaults, Credentials |
| skills-2025-10-02 | Skills API (for managing custom skill definitions) |
| files-api-2025-04-14 | Files API for file uploads |
Note: do not intermix beta headers. If you need to upload a skill or file via the Skills API or Files API, use the appropriate beta header listed above. You do NOT need to include either the Skills or Files beta header when calling any of the Managed Agents endpoints listed in row 1 above. Send the Skills or Files beta header only with its own endpoints.
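For hand-rolled HTTP, where the SDK is not setting headers for you, the one-header-per-endpoint rule could be sketched as a lookup. `beta_header_for` and `BETA_BY_PREFIX` are hypothetical names, and the `/v1/skills` and `/v1/files` path prefixes are assumptions for illustration; only `/v1/agents` and `/v1/sessions` appear in this document.

```python
# Hypothetical mapping for raw-HTTP callers; the SDK sets these automatically.
BETA_BY_PREFIX = {
    "/v1/agents": "managed-agents-2026-04-01",
    "/v1/sessions": "managed-agents-2026-04-01",
    "/v1/skills": "skills-2025-10-02",    # assumed path prefix
    "/v1/files": "files-api-2025-04-14",  # assumed path prefix
}


def beta_header_for(path: str) -> dict:
    """Return exactly one beta header for the endpoint, never a mix."""
    for prefix, beta in BETA_BY_PREFIX.items():
        if path == prefix or path.startswith(prefix + "/"):
            return {"anthropic-beta": beta}
    raise KeyError(f"no beta header known for {path}")
```

Returning a single value per path makes it impossible to send the Skills or Files header on a Managed Agents call by accident.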
Reading Guide
| User wants to... | Read these files |
|---|---|
| Get started from scratch / "help me set up an agent" | shared/managed-agents-onboarding.md — guided interview (WHERE→WHO→WHAT→WATCH), then emit code |
| Understand how the API works | shared/managed-agents-core.md |
| See the full endpoint reference | shared/managed-agents-api-reference.md |
| Create an agent (required first step) | shared/managed-agents-core.md (Agents section) + language file |
| Update/version an agent | shared/managed-agents-core.md (Agents → Versioning) — update, don't re-create |
| Create a session | shared/managed-agents-core.md + {lang}/managed-agents/README.md |
| Configure tools and permissions | shared/managed-agents-tools.md |
| Set up MCP servers | shared/managed-agents-tools.md (MCP Servers section) |
| Stream events / handle tool_use | shared/managed-agents-events.md + language file |
| Set up environments | shared/managed-agents-environments.md + language file |
| Upload files / attach repos | shared/managed-agents-environments.md (Resources) |
| Store MCP credentials | shared/managed-agents-tools.md (Vaults section) |
Common Pitfalls
- Agent FIRST, then session, NO EXCEPTIONS: the session's agent field accepts only a string ID or {type: "agent", id, version}. model, system, tools, mcp_servers, and skills are top-level fields on POST /v1/agents, never on sessions.create(). If the user hasn't created an agent, that is step zero of every example.
- Agent ONCE, not every run: agents.create() is a setup step. Store the returned agent_id and reuse it; don't call agents.create() at the top of your hot path. If the agent's config needs to change, use POST /v1/agents/{id}; each update creates a new version, and sessions can pin to a specific version for reproducibility.
- MCP auth goes through vaults: the agent's mcp_servers array declares {type, name, url} only (no auth). Credentials live in vaults (client.beta.vaults.credentials.create) and attach to sessions via vault_ids. Anthropic auto-refreshes OAuth tokens using the stored refresh token.
- Stream to get events: GET /v1/sessions/{id}/events/stream is the primary way to receive agent output in real time.
- SSE stream has no replay, so reconnect with consolidation: if the stream drops while an agent.tool_use, agent.mcp_tool_use, or agent.custom_tool_use is pending resolution (user.tool_confirmation for the first two, user.custom_tool_result for the last one), the session deadlocks (client disconnects → session idles → reconnect happens → no client resolution happens). On every (re)connect: open the stream with GET /v1/sessions/{id}/events/stream, fetch GET /v1/sessions/{id}/events, dedupe by event ID, then proceed. See shared/managed-agents-events.md → Reconnecting after a dropped stream.
- Don't trust HTTP-library timeouts as wall-clock caps: requests timeout=(c, r) and httpx.Timeout(n) are per-chunk read timeouts; they reset every byte, so a trickling connection can block indefinitely. For a hard deadline on raw-HTTP polling, track time.monotonic() at the loop level and bail explicitly. Prefer the SDK's sessions.events.stream() / sessions.events.list() over hand-rolled HTTP. See shared/managed-agents-events.md → Receiving Events.
- Messages queue: you can send events while the session is running or idle; they're processed in order. No need to wait for a response before sending the next message.
- Cloud environments only: config.type: "cloud" is the only supported environment type.