Friday, May 8, 2026

I Built a Hub-Worker Setup from Scratch with the CLI — Then Compared It to the Managed Agents API

Anthropic released the Managed Agents API, so I put it side by side with my own Hub-Worker setup, and the architecture was almost identical. I had been running 8 Claude Code CLI instances in parallel under tmux and wiring everything together by hand, only to find that the result matched Anthropic's official infrastructure. It was a genuinely strange feeling.

This article compares my self-built setup against these three features:

  1. Managed Agents API — a cloud platform that manages agent definitions, environments, sessions, and events end-to-end
  2. Multiagent sessions — a multi-agent execution layer where a Coordinator orchestrates multiple Workers in parallel or in sequence (Research Preview)
  3. Dreams — an async job that reads past session logs and automatically reorganizes and rebuilds a memory store (Research Preview)

My self-built setup

Here is a quick overview of my current setup for reference; the full details are in this article.

  • 8 Claude Code CLI instances managed in tmux sessions
  • Hub instance decomposes tasks and delegates to Worker instances via GitHub Issues
  • Worker triggers are tmux send-keys-based scripts
  • Governance rules are written down in POLICY.md, which each instance references autonomously

Figure: Self-built Hub-Worker setup

  User (takeyasu)
    ▼ tmux send-keys
  Hub (Claude Code CLI; tmux: cl_orchestrator; governance via POLICY.md)
    ▼ GitHub Issues + tmux send-keys trigger
  Workers: BlogGen, ImgGen, ClaudeChat, … × 5
    ▼ read/write
  WSL2 filesystem / Git repositories
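
The delegation path above (the Hub records a task in a GitHub Issue, then nudges a Worker with tmux send-keys) can be sketched roughly as follows. This is a minimal sketch, not the actual scripts from my setup: the tmux target `bloggen`, the label `worker:bloggen`, the prompt wording, and all function names are hypothetical. It assumes the `gh` and `tmux` CLIs are installed and that the label already exists in the repository. By default it only builds the commands and does not run them.

```python
import subprocess


def issue_command(title: str, body: str, label: str) -> list[str]:
    """Build a `gh issue create` command that records the task for a Worker."""
    return ["gh", "issue", "create", "--title", title, "--body", body, "--label", label]


def trigger_command(tmux_target: str, prompt: str) -> list[str]:
    """Build a `tmux send-keys` command that types a prompt into a Worker pane.

    The trailing "Enter" is a tmux key name, so the typed prompt is submitted.
    """
    return ["tmux", "send-keys", "-t", tmux_target, prompt, "Enter"]


def delegate(tmux_target: str, title: str, body: str, label: str,
             dry_run: bool = True) -> list[list[str]]:
    """Return the commands for one delegation; run them only if dry_run is False."""
    commands = [
        issue_command(title, body, label),
        # Hypothetical prompt wording; the real trigger text is not shown in this article.
        trigger_command(tmux_target, f"Check the open issue labeled {label} and start on it."),
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the two commands instead of executing them.
    for cmd in delegate("bloggen", "Draft weekly post", "Outline and draft.", "worker:bloggen"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes the trigger easy to inspect before it touches a live tmux session.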

Managed Agents API — the designs turned out almost identical

My first thought reading the Managed Agents API docs was "wait, I built this." Mapping the core concepts against my own setup:

| Managed Agents API concept | Self-built equivalent |
|---|---|
| Agent (model, system prompt, tool definitions) | Each CLI instance with its role defined via CLAUDE.md + settings.local.json |
| Environment (container, packages, network config) | WSL2 + per-repo .venv and path config |
| Session (a running agent instance executing a task) | A Claude Code process inside a tmux session |
| Events (communication between app and agent) | Writing to GitHub Issues + tmux send-keys triggers |

Figure: Managed Agents API setup

  Your App
    ▼ Events API (SSE)
  Session (primary thread)
  Coordinator Agent (defines model / system prompt / tools / MCP)
    ▼ session threads (max 25) · 1-level delegation only
  Worker Agent 1, Worker Agent 2, Worker Agent 3, … up to 20 types
    ▼ read/write
  Shared container filesystem · Memory Store (persistent memory)

The correspondence table and diagrams line up almost perfectly. The implementation differences, however, are significant. Pros & Cons:

| Dimension | Managed Agents API | Self-built (Claude Code CLI) |
|---|---|---|
| Out-of-the-box tools | Must define everything yourself | Bash/Read/Write/MCP all come pre-loaded ✅ |
| Linux ecosystem access | Requires separate design | Can autonomously run apt install and config changes ✅ |
| Observability | Real-time visibility into all threads via event stream ✅ | Watch tmux output manually or via Monitor |
| Infrastructure ops | Anthropic-managed. No environment maintenance needed ✅ | WSL upkeep required, but restorable from git |
| Cost | Per-token API billing. Risk of costs spiking under heavy load | Fixed monthly Max plan. Flat rate no matter how heavy the use ✅ |
| Migration cost | CLI assets can't be ported. Full rewrite required | None (status quo) ✅ |
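
The Cost row comes down to simple arithmetic once you know your monthly token volume. The sketch below shows the comparison only; every number in it is a placeholder chosen for easy math, not an actual Anthropic price or Max plan fee, and the function names are hypothetical.

```python
def api_cost(input_tokens: int, output_tokens: int,
             usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Monthly per-token cost in USD, with prices given per million tokens."""
    return ((input_tokens / 1_000_000) * usd_per_m_input
            + (output_tokens / 1_000_000) * usd_per_m_output)


def cheaper_option(monthly_flat_usd: float, monthly_api_usd: float) -> str:
    """Return "flat" when the fixed plan costs no more than per-token billing."""
    return "flat" if monthly_flat_usd <= monthly_api_usd else "api"


if __name__ == "__main__":
    # Placeholder figures only: 40M input and 8M output tokens per month,
    # at 1.0 and 10.0 USD per million tokens, against a 100 USD flat plan.
    monthly = api_cost(40_000_000, 8_000_000, 1.0, 10.0)
    print(monthly, cheaper_option(100.0, monthly))
```

This ignores rate limits on the flat plan, so treat it as a first-pass estimate rather than a full comparison.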

Multiagent sessions — problems the official version solves

Reading the Multiagent sessions docs clarified several places where my self-built setup falls short.

Dropped-ball detection. Each agent runs in a session thread — a context-isolated event stream — and state changes surface as events like session.thread_status_idle / session.thread_status_running that aggregate into the primary thread. In my self-built setup, a state where nobody throws the next ball fires nothing at all — that was the biggest blind spot. The official version solves this structurally.
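
To make the idea concrete, here is a minimal sketch of how an app could spot a "nobody is holding the ball" moment from those status events. It works on plain dictionaries rather than the real API client. The two event type names come from the docs as quoted above, but the `thread_id` field name and the function name are assumptions of mine for illustration.

```python
from typing import Iterable

IDLE = "session.thread_status_idle"
RUNNING = "session.thread_status_running"


def find_stalls(events: Iterable[dict]) -> list[int]:
    """Return the indexes of events after which every known thread is idle."""
    status: dict[str, str] = {}
    stalls: list[int] = []
    for index, event in enumerate(events):
        kind = event.get("type")
        if kind not in (IDLE, RUNNING):
            continue  # ignore events that are not thread status changes
        # "thread_id" is an assumed field name, not confirmed by the docs.
        status[event["thread_id"]] = kind
        if all(state == IDLE for state in status.values()):
            stalls.append(index)
    return stalls
```

"Every thread idle" can also mean the work is finished, so a real check would compare each stall against the list of tasks that are still open.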

Centralized tool confirmations. When a child agent is waiting for approval, requires_action is cross-posted to the primary thread. In my setup, confirmation requests are scattered across instances and easy to miss.

Using {"type": "self"} also lets the Coordinator fan out parallel copies of itself. Thread limits, delegation depth, and access requirements are summarized below.

| Dimension | Multiagent sessions | Self-built (tmux + GitHub Issues) |
|---|---|---|
| Dropped-ball detection | Structurally detected via thread_status_idle | A state where nobody throws is silent. Requires manual watching or polling |
| Tool confirmation aggregation | requires_action centralized in primary thread ✅ | Scattered per instance. Easy to miss |
| Context continuity | Threads persist. Prior context carries over on follow-up instructions ✅ | Same (carries over as long as the tmux session is alive) ✅ |
| Delegation flexibility | Fixed at 1 level (depth enforced by the platform) | Flexible: change it by updating POLICY.md ✅ |
| Scale ceiling | Hard cap: 25 threads, 20 agent types | Unlimited within hardware and Max plan constraints ✅ |
| Access requirements | Research Preview. Separate access request required | Available right now ✅ |
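
On the self-built side, the "manual watching or polling" in the first row could be approximated with a small watchdog. This is a sketch under assumptions, not something my setup currently runs: it treats a pane whose visible text has not changed for a while as quiet, which is a rough proxy (a pane can be unchanged and still busy). The class and function names, the 600-second threshold, and the 30-second interval are hypothetical. It relies on `tmux capture-pane -p -t <target>`, which prints a pane's contents to stdout.

```python
import hashlib
import subprocess
import time


def capture_pane(target: str) -> str:
    """Return the visible text of a tmux pane."""
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", target],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


class StallWatcher:
    """Track when each pane's text last changed."""

    def __init__(self, quiet_seconds: float):
        self.quiet_seconds = quiet_seconds
        self._last_digest: dict[str, str] = {}
        self._last_change: dict[str, float] = {}

    def observe(self, target: str, text: str, now: float) -> None:
        """Record a snapshot; update the change time only if the text differs."""
        digest = hashlib.sha256(text.encode()).hexdigest()
        if self._last_digest.get(target) != digest:
            self._last_digest[target] = digest
            self._last_change[target] = now

    def all_quiet(self, now: float) -> bool:
        """True when every observed pane has been unchanged for quiet_seconds."""
        if not self._last_change:
            return False
        return all(now - changed >= self.quiet_seconds
                   for changed in self._last_change.values())


def watch(targets: list[str], quiet_seconds: float = 600, interval: float = 30) -> None:
    """Poll the given tmux targets forever and warn when all of them go quiet."""
    watcher = StallWatcher(quiet_seconds)
    while True:
        now = time.monotonic()
        for target in targets:
            watcher.observe(target, capture_pane(target), now)
        if watcher.all_quiet(now):
            print("All Worker panes are quiet; check whether an issue is still open.")
        time.sleep(interval)
```

Passing the clock value into `observe` and `all_quiet` keeps the stall logic testable without a running tmux server.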

Dreams — automated memory curation

Dreams is an async batch feature that reorganizes and rebuilds a memory store by reading past session logs (Research Preview, access request required). It never modifies the input store, so you can discard the output if you don't like it.

Key specs:

  • Input: existing memory store (required) + up to 100 past sessions (optional)
  • The instructions parameter lets you direct the curation (up to 4,096 characters — e.g. "focus on coding preferences, ignore temporary debug notes")
  • Supported models: claude-opus-4-7 / claude-sonnet-4-6
  • Billing: standard API token billing, proportional to the number and length of input sessions
  • Requires two beta headers: managed-agents-2026-04-01,dreaming-2026-04-21
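
The limits in the list above are easy to check on the client side before submitting a job. The sketch below encodes only what the list states: the session cap, the instruction length cap, the two supported models, and the two beta flags. It deliberately does not call the API. The function and parameter names are hypothetical, and sending the flags under the `anthropic-beta` header name is my assumption about how the request would be built.

```python
SUPPORTED_MODELS = {"claude-opus-4-7", "claude-sonnet-4-6"}
MAX_SESSIONS = 100
MAX_INSTRUCTION_CHARS = 4096

# Assumed header name; the two flag values are the ones listed above.
BETA_HEADERS = {"anthropic-beta": "managed-agents-2026-04-01,dreaming-2026-04-21"}


def check_dream_inputs(model: str, memory_store_id: str,
                       session_ids: list[str], instructions: str = "") -> list[str]:
    """Return a list of problems; an empty list means the inputs fit the stated limits."""
    problems: list[str] = []
    if model not in SUPPORTED_MODELS:
        problems.append(f"unsupported model: {model}")
    if not memory_store_id:
        problems.append("a memory store is required as input")
    if len(session_ids) > MAX_SESSIONS:
        problems.append(f"too many sessions: {len(session_ids)} > {MAX_SESSIONS}")
    if len(instructions) > MAX_INSTRUCTION_CHARS:
        problems.append(f"instructions too long: {len(instructions)} > {MAX_INSTRUCTION_CHARS}")
    return problems
```

Because billing scales with the number and length of input sessions, a check like this is also a natural place to cap a first test batch well below the 100-session limit.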

In my self-built setup, memory updates are manual — and as the number of instances grows, it becomes impossible to keep up. The fact that the docs explicitly say "you can start from an empty store and feed only session logs" means incremental adoption is genuinely practical.

| Dimension | Dreams | Self-built (manual memory management) |
|---|---|---|
| Execution trigger | Runs automatically on a cron or on demand. Works without the user present ✅ | Only when I ask "please learn this" |
| Scale | Same flow regardless of instance count ✅ | Can't keep up once instances multiply |
| Safety | Input store untouched. Review output before swapping in ✅ | Direct overwrites. Unintended edits are hard to detect |
| Cost | Scales with session volume. Risk of costs spiking under heavy load | Stays within Max plan (no extra cost) ✅ |
| Time to completion | Minutes to tens of minutes (async job) | Immediate, within the conversation ✅ |
| Access requirements | Research Preview. Extra headers + access request required | Available right now ✅ |

What building it yourself taught me

Having built it from scratch meant that when I read the Managed Agents API docs, the "why behind the design" clicked immediately. The 1-level delegation cap, thread persistence, aggregation into the primary thread — every one of those is a direct answer to a problem I hit while building my own version.

"Being able to use generative AI" is becoming table stakes. The next differentiator, I think, is hands-on experience with drawing responsibility boundaries between agents, designing context, and designing failure detection and recovery. That experience can only be built by assembling something, breaking it, and learning from it. It is the same reason the early cloud architects who stood out weren't the ones who "knew AWS"; they were the ones who could explain how on-prem failure patterns translate into cloud avoidance strategies.

For now, with my self-built setup in place, the flexibility and flat-rate cost outweigh the benefits of migrating to the API. But when the time comes to add significantly more Workers or to make 24-hour unattended operation the norm, that's when I'll seriously consider the move.

Wrap-up

Put a self-built CLI Hub-Worker setup next to the Managed Agents API and the designs are nearly identical. Where Multiagent sessions clearly wins: structural dropped-ball detection and centralized tool confirmations. Dreams is the answer to a memory management scaling problem that manual approaches can't keep up with — and the ability to start incrementally from an empty store is a big deal.

If this was useful, I'd love it if you shared it on X (Twitter).

FAQ

Q. Is the Managed Agents API available right now?

A. Yes — it's enabled by default for API accounts (beta header managed-agents-2026-04-01 required). Multiagent sessions and Dreams are Research Preview and require a separate access request.

Q. Is migrating from a self-built setup to the Managed Agents API painful?

A. CLI assets (Bash scripts, MCP servers, policy files) can't be ported, so it's effectively a full rewrite. If you migrate, treat it as a new project and redesign from scratch.

Q. Can I use the Managed Agents API on a Max plan?

A. The Managed Agents API is pay-per-token. Usage costs stack on top of the Max plan subscription. For Dreams, since billing scales with session volume, test with a small batch before ramping up to production scale.

Note: Information in this article is current as of May 2026. The Managed Agents API is a beta feature and specs may change. See the Anthropic official docs for the latest.

Note: This article is part of an automated blog update experiment using Claude Code.
