Build a Feature with Sandcastle
Drop gibil in as a Sandcastle sandbox provider. Same agent loop, real VM instead of a local container.
Sandcastle is an agent loop framework: write a prompt, the agent iterates against your repo until tests pass, then opens a PR. Out of the box it runs in local Docker. Swap one import and it runs on a real Hetzner VM instead.
- import { docker } from "@ai-hero/sandcastle/sandboxes/docker";
+ import { gibil } from "@gibil/sandcastle-provider";
Setup
npm install --save-dev @gibil/sandcastle-provider @ai-hero/sandcastle
npm install -g gibil
gibil init # one-time: paste your Hetzner API token
Write the prompt
# .sandcastle/prompt.md
Add a dark mode toggle to the Settings page.
- Add a `theme` field to user preferences (light | dark | system).
- Persist via the existing `usePreferences()` hook.
- Apply the class on `<html>` so Tailwind's `dark:` variants pick it up.
- Update the existing settings test to cover both themes.
Pass when `pnpm test` is green and `pnpm tsc --noEmit` is clean.
Run it on a gibil VM
// run.ts
import { run, claudeCode } from "@ai-hero/sandcastle";
import { gibil } from "@gibil/sandcastle-provider";
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: gibil({ ttl: "1h" }), // VM auto-destroys at the 1h mark
  promptFile: ".sandcastle/prompt.md",
  branchStrategy: { type: "branch", name: "feat/dark-mode" },
});
tsx run.ts
# → [gibil] Provisioning VM... ready in 96.2s
# → [agent] Iteration 1: writing files, running pnpm install
# → [agent] Iteration 2: pnpm test failed, reading output, fixing
# → [agent] Iteration 3: pnpm test passed, opening PR
# → [gibil] Destroying VM... done in 0.9s
The agent field takes any model Sandcastle supports: claudeCode(...) here, but swap in whichever adapter you use. The gibil sandbox is agent-agnostic; it just provides the VM the loop runs on.
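To make the iteration log above concrete, the loop can be pictured roughly as follows. This is a minimal sketch, not Sandcastle's implementation: `runLoop`, `exec`, `iterate`, and `checks` are hypothetical names chosen for illustration, and the sketch is synchronous for brevity. The only ideas taken from this page are that the agent iterates until the pass commands succeed and that an iteration cap (`maxIterations`) bounds the loop.

```typescript
// Hypothetical types for illustration; these are NOT Sandcastle's real API.
type ExecResult = { exitCode: number; output: string };

interface LoopOptions {
  exec: (command: string) => ExecResult; // runs a command inside the sandbox
  iterate: (feedback: string) => void;   // one agent turn, given the last failure output
  checks: string[];                      // commands that must all exit 0
  maxIterations: number;                 // upper bound on agent turns
}

interface LoopResult {
  passed: boolean;
  iterations: number;
}

function runLoop({ exec, iterate, checks, maxIterations }: LoopOptions): LoopResult {
  let feedback = "";
  for (let i = 1; i <= maxIterations; i++) {
    iterate(feedback);

    // Run the checks in order and stop at the first failure.
    let failed: ExecResult | undefined;
    for (const command of checks) {
      const result = exec(command);
      if (result.exitCode !== 0) {
        failed = result;
        break;
      }
    }

    if (!failed) return { passed: true, iterations: i };
    feedback = failed.output; // the next turn sees why the check failed
  }
  return { passed: false, iterations: maxIterations };
}
```

Because the sandbox only appears as the `exec` callback, the same loop would behave identically whether that callback targets a local container or a remote VM, which is the sense in which the sandbox is agent-agnostic.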
Why a VM, not a container
| | docker() provider | gibil() provider |
|---|---|---|
| Agent runs on | Your laptop | A remote VM |
| Isolation | Bind-mount, shared kernel | Separate kernel, IP, disk |
| If the agent breaks the env | Your host pays | The VM dies, forge another |
| Long install / build / test | Heats your laptop | Fans stay quiet |
| Cleanup if you Ctrl+C | Container may survive | TTL still kills the VM |
| Cost (cax11, idle) | $0 | ~$0.008/hr |
Use docker() for sub-minute, low-friction loops. Use gibil() when the agent needs heavy compute, a clean kernel, or a session that outlives your terminal.
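The "TTL still kills the VM" row can be pictured as a deadline fixed when the VM is created, so it does not depend on your terminal staying alive. How gibil actually enforces the TTL is not described here; the sketch below only shows the deadline arithmetic. It assumes TTL strings of the form "30m" or "1h" (gibil's accepted formats may differ), and `parseTtlMs`, `destroyAt`, and `isExpired` are hypothetical helper names.

```typescript
// Assumption: TTLs look like "45s", "30m", or "1h". gibil may accept other formats.
const UNIT_MS: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000 };

// Convert a TTL string into milliseconds.
function parseTtlMs(ttl: string): number {
  const match = /^(\d+)([smh])$/.exec(ttl);
  if (!match) throw new Error(`Unsupported TTL: ${ttl}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// The destroy deadline depends only on the creation time and the TTL.
function destroyAt(createdAt: Date, ttl: string): Date {
  return new Date(createdAt.getTime() + parseTtlMs(ttl));
}

// True once the deadline has been reached, regardless of client state.
function isExpired(createdAt: Date, ttl: string, now: Date): boolean {
  return now.getTime() >= destroyAt(createdAt, ttl).getTime();
}
```

With `ttl: "1h"`, a VM created at 14:00 would have a 15:00 deadline whether or not the agent loop is still running at that point.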
The provider implements every method on Sandcastle's IsolatedSandboxHandle: exec, stdin/stdout streaming, copyIn (file + directory), copyFileOut, env propagation, cwd override. Your existing hooks, maxIterations, and idleTimeoutSeconds work unchanged.
Cost in practice
A typical 5–10 iteration feature loop runs 15–25 minutes on cax11. ~€0.003 of compute, billed to your own Hetzner account. No third-party sandbox markup, no per-call fees. The VM auto-destroys at the TTL boundary even if the agent loop, your terminal, or your laptop dies.
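The arithmetic behind that figure can be checked with a short sketch. It assumes cost scales linearly with wall-clock minutes at the approximate 0.008/hr cax11 rate from the comparison table; actual Hetzner billing granularity and current prices may differ, and `estimateCost` is a hypothetical helper.

```typescript
// Assumption: cost is linear in run time at a flat hourly rate.
function estimateCost(minutes: number, hourlyRate: number): number {
  return (minutes / 60) * hourlyRate;
}

const CAX11_HOURLY = 0.008; // approximate rate taken from the table above

const low = estimateCost(15, CAX11_HOURLY);  // 15-minute loop
const high = estimateCost(25, CAX11_HOURLY); // 25-minute loop

console.log(low.toFixed(4), high.toFixed(4)); // 0.0020 0.0033
```

Under these assumptions a 15–25 minute loop lands between roughly 0.002 and 0.0033, consistent with the ~0.003 quoted above.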
Next steps
- Sandcastle integration README: full options, hooks, branch-strategy notes
- Code-test loop: same shape without the Sandcastle framework
- Parallel test sharding: fan multiple gibil VMs out for one test suite