VMs vs Sandboxes vs Cloud Sessions: Picking the Right Compute for AI Agents

Four architectural options for giving your AI agent a machine to work on. Each makes a different tradeoff.

When your AI agent needs to run your full stack, build your project, or execute a test suite, it needs compute somewhere. You have four architectural options — and picking the wrong one is expensive, slow, or both.

Option 1: Anthropic cloud sessions (built in)

If you use Claude Code, the --remote flag or the web interface gives you a managed VM backed by Anthropic: 4 vCPU, 16 GB RAM, Docker, PostgreSQL, Redis, and environment caching. It's included in your subscription.

Best for: Most workloads. If your project fits in 4 vCPU / 16 GB and you're fine with Anthropic's infrastructure, this is the obvious starting point — it's already paid for.

Limitations: Resource ceiling is fixed. Sessions share quota with your normal Claude usage. Code runs on Anthropic-managed infrastructure, which matters for proprietary or regulated codebases. Parallel sessions burn your daily Routines allocation fast.

Option 2: Sandboxes (E2B, Daytona, Blaxel)

Firecracker microVMs or Docker containers with sub-second boot times and SDK-first APIs. Products like E2B are used by Cursor, Perplexity, and Hugging Face to run user-submitted code snippets in isolation.

Best for: Companies building AI products who need to run short code executions at high volume. If you're embedding code execution into your own product, this is the right tool.

Not right for agent workstations: These tools are designed for API integration, not developer workstations. No persistent codebase, no SSH, 8 GB RAM cap, 24h session limit (E2B Pro). An AI agent doing a real build-test loop needs a full machine, not a snippet execution sandbox.
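To make the distinction concrete, here is a toy Python sketch of the snippet-execution model: every call gets a throwaway working directory that vanishes afterward, so nothing persists between runs. This is only an illustration of the lifecycle (a real sandbox like E2B uses microVMs for actual isolation; subprocess gives none).

```python
import shutil
import subprocess
import sys
import tempfile

def run_snippet(code: str) -> str:
    """Run a Python snippet in a throwaway working directory.

    Toy illustration of the ephemeral snippet-execution model:
    state written to the working directory does not survive the call.
    """
    workdir = tempfile.mkdtemp()
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            cwd=workdir, capture_output=True, text=True, timeout=10,
        )
        return result.stdout
    finally:
        shutil.rmtree(workdir)  # the "sandbox" is gone after every call

print(run_snippet("print(2 + 2)"))  # prints 4
```

An agent workstation is the opposite lifecycle: the checked-out repo, installed dependencies, and running services must all survive between the agent's commands.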

Option 3: GitHub Codespaces

A full VS Code environment in the browser backed by a cloud VM. The closest existing product to a developer workstation in the cloud.

Best for: Teams who work primarily in a browser IDE and are deep in the GitHub ecosystem (Actions, Copilot, Issues).

Limitations: $0.36/hr for 4-core compute (vs $0.03 on Hetzner). Persistent by default — costs accumulate even when stopped. Agent support is Copilot-only, GitHub-locked. Doesn't work with agents outside the GitHub ecosystem.
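The hourly gap compounds quickly. A back-of-envelope comparison using the rates above — the usage pattern (8 hours/day, 22 days/month) is an assumption, not a figure from either vendor:

```python
# Rates from the comparison above; usage pattern is an assumption.
CODESPACES_RATE = 0.36  # $/hr, 4-core Codespaces
HETZNER_RATE = 0.03     # $/hr, comparable Hetzner instance
hours_per_month = 8 * 22  # assumed: 8h/day, 22 working days

codespaces_cost = CODESPACES_RATE * hours_per_month
hetzner_cost = HETZNER_RATE * hours_per_month
print(f"Codespaces: ${codespaces_cost:.2f}/mo vs Hetzner: ${hetzner_cost:.2f}/mo")
# prints Codespaces: $63.36/mo vs Hetzner: $5.28/mo
```

A 12x difference per machine, before storage costs — and it multiplies again if you run parallel agent sessions.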

Option 4: Full VMs (Gibil)

A real Hetzner VM — dedicated kernel, public IP, Docker, SSH, and any package you need. Forged in ~30 seconds, destroyed automatically when its TTL expires. You drive it from the CLI or an MCP server, so it works with any AI agent.

Best for: Agent sessions that need more than 4 vCPU / 16 GB, need Docker services (Postgres + Redis + your app running simultaneously), need to run longer than a subscription quota allows, or need code to stay on your own infrastructure. Agent-agnostic — any MCP-compatible agent works.

gibil create --name workstation --repo github.com/you/project --ttl 90m
# → Your agent connects via MCP and starts working
# → Full root, Docker, SSH, any Hetzner instance size
gibil destroy workstation
# → Gone, no orphaned resources

How to choose

Your situation                                                                | Best option
Using Claude Code and project fits in 4 vCPU / 16 GB                          | Anthropic cloud sessions
Building a product that runs user code snippets via API                       | E2B / sandbox
Team uses VS Code in browser, deep in GitHub ecosystem                        | Codespaces
Need more compute, or code must stay on your infra, or need parallel sessions | Gibil
Complex stack (Docker services + large repo + long test suite)                | Gibil
Using multiple agents or want agent-agnostic infrastructure                   | Gibil

The starting point

Anthropic's cloud sessions are the right default — they're already included in your subscription, they're built in, and they cover most cases.

Gibil is what you reach for when you hit the ceiling: more RAM, more CPU, more sessions, your own infrastructure — or when you want to use any agent, not just one vendor's cloud sessions.

Gibil doesn't care which agent you use. MCP in, SSH in, full machine out.