Managed Agents tutorial: iterate on a failing test suite
Apr 2026 • Tutorial • Beginner • Agent Patterns • Tools • Cookbook
Entry-point tutorial for the Claude Managed Agents API. Walks through agent / environment / session creation, file mounts, and the streaming event loop by getting an agent to fix three planted bugs in a calc.py package.
This tutorial introduces the Claude Managed Agents API by walking through a practical debugging workflow in which an agent iteratively fixes failing tests in a Python package. It covers the three core resources (Agent, Environment, Session), file mounting, and the streaming event loop pattern. The example demonstrates how agents autonomously discover the iterate-observe-fix loop by running tests, reading failures, editing code, and repeating until all assertions pass.
Key Points
- Three core resources: Agent (reusable config with model/prompt/tools), Environment (container template), and Session (binds agent + environment, mounts files, produces the event stream)
- Create the agent once with a system prompt and toolset (agent_toolset_20260401 includes bash, read, write, edit, glob, grep, web_fetch, web_search)
- Set permission_policy to 'always_allow' so the agent can execute tools without confirmation round-trips
- Upload files via the Files API and mount them read-only under /mnt/session/uploads/<mount_path>; the agent must copy them to /mnt/user or /tmp before editing
- Open the SSE stream before sending events, so that no event is emitted before you are listening
- Exit the streaming loop on session.status_idle with stop_reason.type == 'end_turn' (not when the session is idle because it is waiting for custom tool responses)
- The agent discovers the iterate pattern on its own: run tests → read traceback → edit code → rerun → repeat until green
- Downstream test failures (such as test_mean, which fails only until add and divide are fixed) teach the agent to fix root causes rather than over-fix dependent code
- Verify results independently by re-running the assertions after the agent completes, catching any regressions or over-fixes
- Use a wait_for_idle_status helper before archiving to handle the race condition in which the session.status field briefly lags behind the stream status
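The last point names a wait_for_idle_status helper without showing it. The following is a minimal sketch of what such a helper might look like, not the tutorial's own implementation. It polls a caller-supplied `get_status` callable (hypothetical; in practice it would wrap whatever call returns the session's current status) and the `"idle"` status string is likewise an assumption.

```python
import time
from typing import Callable


def wait_for_idle_status(
    get_status: Callable[[], str],  # hypothetical: returns the session's status
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
    idle_value: str = "idle",  # assumed status string, not confirmed by the source
) -> bool:
    """Poll get_status() until it reports idle or the timeout elapses.

    Returns True if the idle status was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == idle_value:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

Passing the status lookup in as a callable keeps the helper independent of any particular SDK method, so it can be exercised with a stub before being wired to a real session.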
Artifacts (4)
agent_creation (python, script)
# Assumes `client` is an Anthropic SDK client and MODEL is a model ID string,
# both defined earlier in the tutorial.
agent = client.beta.agents.create(
    name="cookbook-iterate",
    model=MODEL,
    system=(
        "You are a debugging agent. Your job is to make failing tests pass. "
        "Run the tests, read the failures, fix the code, repeat until green. "
        "Stop when every assertion passes."
    ),
    tools=[
        {
            "type": "agent_toolset_20260401",
            "default_config": {
                "enabled": True,
                # Let the agent run tools without a confirmation round-trip.
                "permission_policy": {"type": "always_allow"},
            },
        }
    ],
)

environment_creation (python, script)
env = client.beta.environments.create(
    name="cookbook-iterate-env",
    config={
        "type": "cloud",
        "networking": {"type": "limited"},
    },
)

session_creation (python, script)
# calc_file and test_file are the Files API upload results for calc.py and
# test_calc.py. Each file is mounted read-only under
# /mnt/session/uploads/<mount_path>.
session = client.beta.sessions.create(
    environment_id=env.id,
    agent={"type": "agent", "id": agent.id, "version": agent.version},
    resources=[
        {
            "type": "file",
            "file_id": calc_file.id,
            "mount_path": "calc.py",
        },
        {
            "type": "file",
            "file_id": test_file.id,
            "mount_path": "test_calc.py",
        },
    ],
    title="Get the tests green",
)

streaming_event_loop (python, script)
# Open the stream before sending the first event so that nothing is missed.
with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session_id=session.id,
        events=[
            {
                "type": "user.message",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "The tests in /mnt/session/uploads/test_calc.py are "
                            "failing. Copy both files into /mnt/user, iterate "
                            "on calc.py until every test passes, then write the "
                            "final calc.py to /mnt/session/outputs/calc.py. "
                            "pytest isn't installed here, run the assertions "
                            "directly with `python3 -c ...` instead."
                        ),
                    }
                ],
            }
        ],
    )
    print("--- iterate loop ---")
    for ev in stream:
        match ev.type:
            case "agent.message":
                # Print the agent's text output as it streams in.
                for b in ev.content:
                    if b.type == "text":
                        print(b.text, end="")
            case "agent.tool_use":
                print(f"\n[{ev.name}]")
            # Stop only when the agent has finished its turn, not when it is
            # idle for another reason (e.g. waiting on a custom tool response).
            case "session.status_idle" if ev.stop_reason and ev.stop_reason.type == "end_turn":
                break
            case "session.status_terminated":
                break
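
The key points recommend verifying the result independently once the loop exits. A minimal sketch of that check follows, under stated assumptions: `FIXED_CALC` is a hypothetical stand-in for the calc.py the agent wrote (the real file's contents are not shown here), `ASSERTIONS` is a hypothetical stand-in for the checks in test_calc.py, and `verify` is an illustrative helper name. It mirrors the prompt's approach of running assertions with `python -c` rather than pytest.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the agent's final calc.py.
FIXED_CALC = '''
def add(a, b):
    return a + b

def divide(a, b):
    return a / b

def mean(values):
    total = 0
    for v in values:
        total = add(total, v)
    return divide(total, len(values))
'''

# Hypothetical assertions; substitute the ones from test_calc.py.
ASSERTIONS = """
import calc
assert calc.add(2, 3) == 5
assert calc.divide(7, 2) == 3.5
assert calc.mean([1, 2, 3, 4]) == 2.5
"""


def verify(calc_source: str, assertions: str) -> tuple[bool, str]:
    """Run the assertions against calc_source in a fresh interpreter.

    Returns (passed, stderr) so a failure's traceback can be inspected.
    """
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "calc.py").write_text(calc_source)
        proc = subprocess.run(
            [sys.executable, "-c", assertions],
            cwd=tmp,
            env={**os.environ, "PYTHONPATH": tmp},  # make `import calc` resolve
            capture_output=True,
            text=True,
        )
    return proc.returncode == 0, proc.stderr
```

Running the check in a separate process means the verdict does not depend on anything the agent reported in the stream; a non-zero exit code or an AssertionError in stderr signals a regression or an over-fix.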