Agent Daily
tutorial • beginner

Managed Agents tutorial: iterate on a failing test suite
Apr 2026 • Agent Patterns Tools
Entry-point tutorial for the Claude Managed Agents API. Walks through agent / environment / session creation, file mounts, and the streaming event loop by getting an agent to fix three planted bugs in a calc.py package.


This tutorial introduces the Claude Managed Agents API by walking through a practical debugging workflow where an agent iteratively fixes failing tests in a Python package. It covers the three core resources (Agent, Environment, Session), file mounting, and the streaming event loop pattern. The example demonstrates how agents autonomously discover the iterate-observe-fix loop by running tests, reading failures, editing code, and repeating until all assertions pass.
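The run, read, fix, repeat cycle can be tried locally before involving an agent. The sketch below is only an illustration: the tutorial's actual calc.py and test_calc.py are not reproduced here, so the `BUGGY` source, its two bugs, the `CHECKS` expressions, and the `failing_checks` helper are all hypothetical stand-ins.

```python
# Hypothetical stand-in for calc.py; these bugs are invented for illustration
# and are not the ones planted in the tutorial's file.
BUGGY = """
def add(a, b):
    return a - b

def divide(a, b):
    return a // b

def mean(xs):
    total = 0
    for x in xs:
        total = add(total, x)
    return divide(total, len(xs))
"""

# Each fix touches one root cause; mean() itself is never edited.
ADD_FIXED = BUGGY.replace("a - b", "a + b")
ALL_FIXED = ADD_FIXED.replace("a // b", "a / b")

# Hypothetical assertions, written as expressions that should evaluate to True.
CHECKS = {
    "test_add": "add(2, 3) == 5",
    "test_divide": "divide(7, 2) == 3.5",
    "test_mean": "mean([1, 2, 3, 4]) == 2.5",
}


def failing_checks(source, checks):
    """Execute `source` in a fresh namespace and return the names of failing checks."""
    namespace = {}
    exec(source, namespace)
    failed = []
    for name, expression in checks.items():
        try:
            passed = bool(eval(expression, namespace))
        except Exception:
            # A check that raises counts as a failure.
            passed = False
        if not passed:
            failed.append(name)
    return failed


if __name__ == "__main__":
    for label, source in [("buggy", BUGGY), ("add fixed", ADD_FIXED), ("all fixed", ALL_FIXED)]:
        print(label, failing_checks(source, CHECKS))
```

In this stand-in, test_mean stays red until both add and divide are repaired, even though mean is never touched. The same helper can serve as an independent check on whatever file the agent hands back.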

Key Points

  • Three core resources: Agent (reusable config with model/prompt/tools), Environment (container template), and Session (binds agent+environment, mounts files, produces event stream)
  • Create the agent once with a system prompt and a toolset (agent_toolset_20260401 includes bash, read, write, edit, glob, grep, web_fetch, web_search)
  • Set permission_policy to 'always_allow' to let agents execute tools without confirmation round-trips
  • Upload files via the Files API and mount them under /mnt/session/uploads/<mount_path>; mounts are read-only, so the agent must copy files to /mnt/user or /tmp before editing them
  • Open the SSE stream before sending any events; this avoids a race in which events are emitted before the client is listening
  • Exit the streaming loop on session.status_idle with stop_reason.type == 'end_turn'; an idle status with another stop reason (for example, while the session waits for a custom tool response) does not mean the turn is finished
  • The agent works out the iterate pattern on its own: run tests → read traceback → edit code → rerun → repeat until green
  • Downstream test failures (test_mean depends on the add and divide fixes) show why the agent should fix root causes rather than patch every function whose test fails
  • Verify the result independently: re-run the assertions yourself after the agent finishes to catch regressions or over-fixes
  • Call a wait_for_idle_status helper before archiving; the session's status field can briefly lag behind the status reported on the stream
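The wait_for_idle_status helper is named in the list above but its body is not shown, so the following is a minimal sketch of how such a poll might look, not the tutorial's implementation. The `get_status` callable, the `"idle"` status string, and the timeout and interval defaults are assumptions; in practice `get_status` would wrap whatever call returns the session's current status.

```python
import time
from typing import Callable


def wait_for_idle_status(
    get_status: Callable[[], str],
    timeout: float = 10.0,
    interval: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns "idle" or the timeout elapses.

    Returns True if the idle status was observed, False on timeout.
    The "idle" value is an assumption about the status field's vocabulary.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == "idle":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated status field that lags for two polls before reporting idle.
    lagging = iter(["running", "running", "idle"])
    print(wait_for_idle_status(lambda: next(lagging), interval=0.01))
```

Injecting `sleep` and `clock` keeps the helper testable without real delays; callers that do not care can leave the defaults.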



Artifacts (4)

agent_creation (python script)
agent = client.beta.agents.create(
    name="cookbook-iterate",
    model=MODEL,
    system=(
        "You are a debugging agent. Your job is to make failing tests pass. "
        "Run the tests, read the failures, fix the code, repeat until green. "
        "Stop when every assertion passes."
    ),
    tools=[
        {
            "type": "agent_toolset_20260401",
            "default_config": {
                "enabled": True,
                "permission_policy": {
                    "type": "always_allow"
                },
            },
        }
    ],
)
environment_creation (python script)
env = client.beta.environments.create(
    name="cookbook-iterate-env",
    config={
        "type": "cloud",
        "networking": {
            "type": "limited"
        }
    },
)
session_creation (python script)
session = client.beta.sessions.create(
    environment_id=env.id,
    agent={
        "type": "agent",
        "id": agent.id,
        "version": agent.version
    },
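    # Files uploaded beforehand via the Files API; each is mounted read-only
    # under /mnt/session/uploads/<mount_path>.
    resources=[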
        {
            "type": "file",
            "file_id": calc_file.id,
            "mount_path": "calc.py"
        },
        {
            "type": "file",
            "file_id": test_file.id,
            "mount_path": "test_calc.py"
        },
    ],
    title="Get the tests green",
)
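The relationship between a resource's mount_path and the path the agent sees can be made explicit with two small helpers. This is a sketch under assumptions: `file_resources` and `container_path` are hypothetical names, the file IDs in the usage lines are placeholders, and mount paths are assumed to be relative.

```python
from pathlib import PurePosixPath

# Root under which uploaded files are mounted, per the key points above.
UPLOADS_ROOT = PurePosixPath("/mnt/session/uploads")


def file_resources(mounts: dict[str, str]) -> list[dict[str, str]]:
    """Build the `resources` list from a {mount_path: file_id} mapping."""
    return [
        {"type": "file", "file_id": file_id, "mount_path": mount_path}
        for mount_path, file_id in mounts.items()
    ]


def container_path(mount_path: str) -> str:
    """Return the read-only path of a mounted file inside the container.

    Assumes `mount_path` is relative, as in the session example.
    """
    return str(UPLOADS_ROOT / mount_path)


if __name__ == "__main__":
    # Placeholder file IDs; real ones come from the upload step.
    resources = file_resources({"calc.py": "file_calc", "test_calc.py": "file_test"})
    for resource in resources:
        print(resource["file_id"], "->", container_path(resource["mount_path"]))
```

Deriving the prompt's paths from the same mapping that builds the resources keeps the two from drifting apart when a mount path changes.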
streaming_event_loop (python script)
# Open the stream before sending the first event so no events are missed.
with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(
        session_id=session.id,
        events=[
            {
                "type": "user.message",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "The tests in /mnt/session/uploads/test_calc.py are "
                            "failing. Copy both files into /mnt/user, iterate "
                            "on calc.py until every test passes, then write the "
                            "final calc.py to /mnt/session/outputs/calc.py. "
                            "pytest isn't installed here, run the assertions "
                            "directly with `python3 -c ...` instead."
                        ),
                    }
                ],
            }
        ],
    )
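    print("--- iterate loop ---")
    for ev in stream:
        match ev.type:
            # Agent text: print each text block as it arrives.
            case "agent.message":
                for b in ev.content:
                    if b.type == "text":
                        print(b.text, end="")
            # Tool call: print only the tool name.
            case "agent.tool_use":
                print(f"\n[{ev.name}]")
            # Idle with end_turn: the agent has finished its turn.
            case "session.status_idle" if ev.stop_reason and ev.stop_reason.type == "end_turn":
                break
            # Session terminated: leave the loop as well.
            case "session.status_terminated":
                break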