One Orchestrator, Many Machines

Every agent workflow has two roles: the orchestrator that decides, and the box where work runs. Split them and one driver runs many machines.

You point Claude Code at a feature, watch it work, review the diff, and move to the next one. One agent, one task, one machine: yours. The agent is fast, but you are running it like a single-lane road.

Now picture three features going at once: one agent wiring up Stripe checkout, another rebuilding the onboarding flow, a third adding retries to a webhook handler. Each on its own machine, none of them able to step on the others, all reporting back to you as they finish. You have stopped writing code and started reviewing pull requests as they land.

That is the shift. And the trick to it is a single idea you are already using without naming it.

Two roles hiding in every workflow

Every agent workflow, from a one-line command to a fleet of machines, is built from two roles.

The orchestrator decides what work happens and in what order. It holds your credentials, drives the loop, and owns the machine's life: it forges the box and burns it when the work is done.

The execution box is where the work actually runs. It is a real Linux server with your code, your tools, and root. It runs commands, holds the worktree, and deletes itself when its TTL timer runs out, even if nothing ever burns it.

When you run Claude Code locally, those two roles are squashed onto one machine: your laptop is both the orchestrator and the box. That is why it feels single-lane. The moment you pull the roles apart, the orchestrator can drive more than one box, and the lane count goes up.

Why pulling them apart works

The orchestrator forges each box and burns it at the end, on success and on failure. That ownership is the whole reason the roles separate cleanly. A process cannot forge the box it runs on, and it cannot reliably burn it either. If the orchestrator lived inside the box, it would be deleting the machine it was executing on, leaving it unable to clean up, report a result, or react when the agent inside breaks something.

So the orchestrator sits outside the box and outlives it. The box can be wiped, filled with junk, or wedged by a runaway install, and the orchestrator survives to forge another. It is the same relationship a CI runner has with the container it builds: the runner creates and destroys the container, it never runs inside it.

Once that boundary is clean, "drive one box" and "drive twenty" are the same code with a loop around it.
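As a rough sketch of that loop in shell, using only the gibil commands that appear in this post: the function name drive_boxes and the branch-to-box naming rule (slashes become dashes, as in feat/payments and feat-payments) are assumptions for illustration, not documented gibil behavior.

```shell
# Sketch only: forge boxes, run one task per box, always burn at the end.
# Assumption: a branch such as feat/payments yields a box named feat-payments.
drive_boxes() {
  task=$1   # prompt handed to every box; must not contain single quotes
  shift     # remaining arguments are branch names
  rc=0
  if gibil branch --agent claude --ttl 2h "$@"; then
    for branch in "$@"; do
      box=$(printf '%s' "$branch" | tr '/' '-')
      gibil run "$box" "cd /root/project && claude -p '$task'" --json || rc=1
    done
  else
    rc=1
  fi
  # Burn on success and on failure. --all presumably covers every box,
  # not only the ones forged here.
  gibil destroy --all --yes
  return "$rc"
}
```

Calling drive_boxes 'add exponential backoff' feat/retries drives one box. Adding more branch names drives more, with no other change to the code.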

Forging them in parallel

Here is the simplest version. Three branches, three machines, each with a coding agent pre-installed, all forged in parallel:

gibil branch --agent claude --ttl 2h \
  feat/payments feat/onboarding feat/retries

About ninety seconds later you have three real servers, each on its own branch with the repo cloned and claude installed. They have separate kernels, separate IPs, and no shared state. Worker number two cannot see worker number one's files or processes.

From there your orchestrator, which is just you plus a terminal agent today, hands each box a task and runs it:

gibil run feat-payments "cd /root/project && claude -p 'wire up stripe checkout'" --json
gibil run feat-onboarding "cd /root/project && claude -p 'rebuild the welcome flow'" --json
gibil run feat-retries "cd /root/project && claude -p 'add exponential backoff'" --json
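Typed one after another, those three commands would presumably run in sequence, assuming gibil run blocks until the remote command exits (the post does not say either way). A minimal sketch of dispatching them side by side with ordinary shell background jobs follows. The name dispatch_all and the box|prompt input format are invented for this example.

```shell
# Sketch only: start one `gibil run` per input line in the background,
# save each box's output to its own file, then wait for all of them.
# Input on stdin: one "box|prompt" pair per line (prompt without single quotes).
dispatch_all() {
  out_dir=$1
  mkdir -p "$out_dir"
  started=""
  while IFS='|' read -r box prompt; do
    [ -n "$box" ] || continue
    gibil run "$box" "cd /root/project && claude -p '$prompt'" --json \
      > "$out_dir/$box.json" 2>&1 < /dev/null &
    started="$started $!:$box"   # remember pid and box name together
  done
  failed=0
  for entry in $started; do
    if ! wait "${entry%%:*}"; then
      echo "failed: ${entry#*:}" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Usage (not executed here):
#   dispatch_all ./runs <<'EOF'
#   feat-payments|wire up stripe checkout
#   feat-onboarding|rebuild the welcome flow
#   feat-retries|add exponential backoff
#   EOF
```

The function returns non-zero if any box's command failed, and names the failing boxes on stderr, so a wrapper script can decide what to review first.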

When you are done reviewing, burn them all at once:

gibil destroy --all --yes

If your orchestrator speaks MCP, the shape is the same with typed tools instead of shell strings. The agent forges each box with the branch and worker agent baked in, then dispatches each task as a background job and polls for it:

create_server({ name: "feat-payments", repo: "github.com/you/proj", branch: "feat/payments", agent: "claude", ttl: 120, env: { ANTHROPIC_API_KEY: "sk-ant-..." } })
// ...feat-onboarding, feat-retries the same way...

vm_bash({ server: "feat-payments", command: "claude -p 'wire up stripe checkout'", background: true })  // returns a job_id
vm_job_status({ job_id: "j-abc123" })   // poll until the exit code comes back
destroy_server({ name: "feat-payments" })
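The polling step is the same whatever transport carries it. A minimal shell sketch of the loop, assuming a hypothetical check_job command that wraps the status call and prints the job's exit code once the job has finished (and prints nothing while it is still running):

```shell
# Sketch only: poll a job until it reports an exit code or we give up.
# check_job is hypothetical; it stands in for whatever wraps vm_job_status.
poll_job() {
  job_id=$1
  interval=$2       # seconds to sleep between polls
  max_attempts=$3   # give up after this many polls
  attempts=0
  while [ "$attempts" -lt "$max_attempts" ]; do
    code=$(check_job "$job_id")
    if [ -n "$code" ]; then
      echo "$code"   # the job's exit code
      return 0
    fi
    attempts=$((attempts + 1))
    sleep "$interval"
  done
  return 1           # still running when we stopped polling
}
```

A bounded attempt count matters here: the box deletes itself at its TTL, so a loop that polls forever can outlive the job it is waiting on.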

Either way, the picture is one driver and N workers:

                 ┌──────────────┐   ← orchestrator (you + an agent, or a script)
                 │  forge x3    │
                 └──┬────┬────┬─┘
           ┌────────┘    │    └────────┐
           ▼             ▼             ▼
     ┌──────────┐  ┌──────────┐  ┌──────────┐
     │  box A   │  │  box B   │  │  box C   │
     │ payments │  │ onboard  │  │ retries  │
     └──────────┘  └──────────┘  └──────────┘

The orchestrator does not have to be your laptop

Here is where it gets interesting. The orchestrator is a role, not a place. It can run on your laptop, on a CI runner, on a long-lived server, or on another gibil box.

That matters because whatever drives the loop has to stay powered. Run the orchestrator on your laptop and the work pauses when the lid closes. The boxes keep running, and they still delete themselves at their TTL so nothing leaks, but the loop stops until you come back.

Move the orchestrator somewhere that stays awake, a CI job or a small gibil box of its own, and the work runs untethered. You kick it off, close the laptop, and come back to a stack of branches waiting for review. The agents kept working in a room you were not standing in.

This is the version that changes how a day feels. The bottleneck is no longer "how fast does one agent run." It is "how fast can you review what three or five or ten of them produced."

What this is, and what it is not

Gibil gives you the boxes, forged in parallel, isolated, and self-deleting. It does not give you a magic "build ten features" button. The orchestration (deciding what each box does, polling for results, picking the winning diff) is your agent's job or a few lines of your own script. Today the untethered, run-it-on-a-box version is a pattern you wire up, not a single flag.

A few honest edges. Boot is real time: roughly ninety seconds per box, in parallel, not instant. Private repos need a GITHUB_TOKEN in the environment or the clone fails quietly. Each machine costs real money while it runs, billed to your own cloud account, which is pennies per box-hour but not zero. And the more boxes you forge, the more your own attention becomes the constraint. A run that produces ten diffs an hour is only useful if you can review ten diffs an hour.
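A cheap guard against the quiet clone failure is to check for the token before forging anything. A minimal sketch, assuming the token has to be exported in the shell that launches gibil; preflight is a made-up helper name:

```shell
# Sketch only: refuse to forge when GITHUB_TOKEN is not exported.
preflight() {
  # A child process only sees exported variables, so test from a child shell.
  if ! sh -c '[ -n "${GITHUB_TOKEN:-}" ]'; then
    echo "GITHUB_TOKEN is not exported; a private-repo clone would fail quietly" >&2
    return 1
  fi
}

# Usage (not executed here):
#   preflight && gibil branch --agent claude --ttl 2h feat/payments
```

Failing loudly on the orchestrator costs nothing, while a quiet clone failure costs a ninety-second boot and a box with no code on it.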

If you would rather not hand-roll the loop at all, a framework like Sandcastle can run a managed agent loop on a gibil box for you. Sandcastle does the orchestration; gibil is the machine underneath. Same two roles, someone else's loop.

The shape of the work changes

The single-agent workflow trains you to wait. You give the agent a task, you watch, you review, you give the next task. The agent is the fast part and you are the throughput limit, but only because everything is serialized through one machine.

Pull the orchestrator off the box and the serialization goes away. You become a reviewer of parallel work instead of a babysitter of one process. The agents get their own machines, which are gone when the work is done, and you get your attention back to spend on the part that actually needs a human: deciding which of the diffs is right.

One orchestrator. Many machines. Each one yours for exactly as long as the work takes.

Where to go next