Why AI Agents Need Servers, Not Sandboxes
Sandboxes typically cap at 4-8 vCPU, with no root access and time-limited sessions. Agents doing real work need full machines.
Your AI agent just cloned a monorepo, installed 400 npm packages, started Postgres, ran 2,000 tests, and pushed a fix to GitHub.
Try doing that in a sandbox.
The sandbox ceiling
Most agent compute platforms give you a container dressed up as something fancier. They call it a sandbox, a microVM, a secure execution environment. The constraints are usually the same:
- 4-8 vCPU cap. Your agent can't parallelize a heavy build.
- No root access. Can't install system packages, configure services, or touch systemd.
- No Docker. The agent runs inside a container — it can't run containers itself.
- No SSH. Can't drop in to debug when something goes wrong.
- Session limits. 1-24 hours, then everything is gone.
These limits exist because the platform shares infrastructure across tenants. Root would be a security disaster. Docker-in-Docker would be a resource nightmare.
For simple tasks — run a script, check an output — sandboxes work. But agents are getting more capable, and their tasks are getting heavier.
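To see which of these limits apply to the environment an agent is actually running in, a quick probe helps. The sketch below is a minimal illustration, not part of any platform's API: `probe_environment` is a hypothetical helper name, and it only checks what the standard library can see (CPU count reported by the OS, effective user ID, and whether a `docker` binary is on the PATH).

```python
import os
import shutil


def probe_environment() -> dict:
    """Report a few of the capabilities that sandboxes tend to restrict."""
    # os.geteuid exists only on Unix-like systems; treat its absence as "not root".
    is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    return {
        # Logical CPUs reported by the OS; a container CPU quota may be lower.
        "cpu_count": os.cpu_count() or 1,
        "is_root": is_root,
        # Only checks that a docker binary is on PATH, not that a daemon is reachable.
        "has_docker_cli": shutil.which("docker") is not None,
    }
```

Calling `probe_environment()` at the start of an agent run gives a cheap early warning before the agent commits to a plan that needs root or Docker.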
What agents actually do
An AI coding agent doesn't just execute a script. It:
- Clones your repo — needs git, SSH keys, maybe a GitHub token
- Installs dependencies — npm, pip, apt packages, system libraries
- Starts services — Postgres, Redis for integration tests
- Builds the project — webpack, cargo, go build — CPU and memory hungry
- Runs tests — sometimes thousands, sometimes in parallel
- Pushes code — git commit, git push, opens a PR
Each step is a system-level operation. Not a function call. A real process running on a real machine.
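The steps above can be sketched as a chain of real processes. This is a simplified illustration under stated assumptions: `run_steps` and `EXAMPLE_STEPS` are hypothetical names, the commands assume a Node.js project, and a real agent would also capture output and handle credentials.

```python
import subprocess


def run_steps(steps, cwd=None):
    """Run each (name, argv) step in order and stop at the first failure.

    Returns the (name, returncode) pairs for the steps that actually ran.
    """
    results = []
    for name, argv in steps:
        completed = subprocess.run(argv, cwd=cwd)
        results.append((name, completed.returncode))
        if completed.returncode != 0:
            break  # later steps depend on earlier ones, so do not continue
    return results


# Hypothetical step list for a Node.js project; swap in your own commands.
EXAMPLE_STEPS = [
    ("install", ["npm", "ci"]),
    ("build", ["npm", "run", "build"]),
    ("test", ["npm", "test"]),
]
```

Running `run_steps(EXAMPLE_STEPS, cwd="path/to/repo")` spawns one operating-system process per step, which is why each step depends on what the machine allows rather than on what a function-call API exposes.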
The server advantage
A full server gives the agent the same environment a developer has:
gibil create --name agent-task --repo github.com/you/project --ttl 60
Now the agent has:
- Root access — install anything, configure anything
- Real networking — a public IP, real ports
- Docker — run services alongside the code
- SSH — the agent (or you) can drop in anytime
- No resource caps — pick the server size you need
The agent works on a real machine. When it's done, the machine disappears. No cleanup, no stale state, no forgotten servers running up a bill.
The tradeoff: boot time
Sandboxes are faster to start. A Firecracker microVM boots in ~150ms. A gibil server takes 30-120 seconds.
For a 5-second script execution, the sandbox wins. For a 30-minute test-fix-test cycle, boot time is noise.
The question isn't "which boots faster." It's "what does the agent need to do?" If the answer involves Docker, SSH, root, or serious CPU, the agent will hit the sandbox ceiling before the task finishes.
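The boot-time argument is simple arithmetic: what share of total wall-clock time goes to booting? A small sketch using the figures quoted above (`boot_overhead` is a hypothetical helper name, and 120 seconds is the worst-case server boot from the stated range):

```python
def boot_overhead(boot_seconds: float, task_seconds: float) -> float:
    """Return the fraction of total wall-clock time spent booting."""
    if boot_seconds < 0 or task_seconds < 0:
        raise ValueError("durations must be non-negative")
    total = boot_seconds + task_seconds
    return boot_seconds / total if total else 0.0


# 5-second script: the server spends almost all of its time booting.
print(round(boot_overhead(0.15, 5), 3))     # sandbox -> 0.029
print(round(boot_overhead(120, 5), 3))      # server  -> 0.96

# 30-minute test-fix-test cycle: even a slow boot is a small share.
print(round(boot_overhead(120, 30 * 60), 4))  # server -> 0.0625
```

At the slow end of the boot range, a 30-minute task spends about 6% of its time waiting for the machine. For a 5-second script, the same boot is 96% of the run.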
When to use what
Use a sandbox when:
- The task is a single script execution (under 5 minutes)
- No Docker, no system packages, no services needed
- Latency matters more than capability
Use a server when:
- The agent needs to build, test, or deploy
- Docker services are part of the workflow
- The task runs for more than a few minutes
- You need SSH for debugging
- The agent needs root
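The two checklists above can be folded into one small decision function. This is a sketch, not a rule any platform enforces: `Task` and `choose_environment` are hypothetical names, and the 5-minute threshold is taken from the sandbox list as a rough cutoff.

```python
from dataclasses import dataclass


@dataclass
class Task:
    expected_minutes: float
    needs_docker: bool = False
    needs_root: bool = False
    needs_ssh: bool = False
    needs_services: bool = False  # e.g. Postgres or Redis for integration tests
    builds_tests_or_deploys: bool = False


def choose_environment(task: Task) -> str:
    """Return "server" or "sandbox" following the checklists above."""
    needs_full_machine = (
        task.needs_docker
        or task.needs_root
        or task.needs_ssh
        or task.needs_services
        or task.builds_tests_or_deploys
    )
    # Assumed cutoff: the sandbox list says "under 5 minutes".
    if needs_full_machine or task.expected_minutes >= 5:
        return "server"
    return "sandbox"
```

For example, `choose_environment(Task(expected_minutes=1))` returns `"sandbox"`, while a 30-minute task that needs Docker returns `"server"`.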
Most AI agent workloads — code generation, test execution, CI, infrastructure testing — are server workloads. The tooling is catching up.