Agent Code-Test Loop
An AI agent writes code, runs tests, reads failures, fixes the code, and repeats: five to twenty iterations on a real server until everything passes.
This is the core use case. It works today with zero caveats.
Workflow
# Forge a server with your repo cloned and ready
gibil create --name pr-42 --repo github.com/you/project --ttl 60
# Agent loops: run tests → parse output → fix code → repeat
gibil run pr-42 "cd /root/project && pnpm test" --json
# → {"stdout": "3 failed, 39 passed", "stderr": "", "exit_code": 1}
# Agent fixes code via MCP or gibil run, then runs again
gibil run pr-42 "cd /root/project && pnpm test" --json
# → {"stdout": "42 passed", "stderr": "", "exit_code": 0}
# Done — burn it
gibil destroy pr-42
What the agent gets
A real Linux server with:
- Your repo cloned to /root/project
- Runtime installed (Node, Python, or Go via .gibil.yml)
- Root access: no permission issues
- SSH for the full TTL window
The --json flag returns structured output the agent can parse without regex:
{
"stdout": "3 failed, 39 passed",
"stderr": "",
"exit_code": 1
}
Why gibil
- Long-lived session: the server stays up for the full TTL, not just one command
- Clean state: fresh Ubuntu 24.04, no leftover artifacts from previous runs
- Machine-readable: --json on every command
- Auto-cleanup: TTL burns the server when the agent is done (or forgets)
For MCP-native agents (Claude Code, Cursor, or any MCP-compatible agent), pair this with MCP mode — the agent gets typed tools (vm_bash, vm_write) instead of shell strings.
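For agents that drive the CLI directly instead of going through MCP, the run, parse, fix, repeat loop from the workflow could be sketched roughly as follows. This is a minimal illustration, not gibil's own client code. It assumes `gibil` is on the PATH and prints the JSON object on its standard output. The helper names (`parse_result`, `gibil_run`, `run_until_green`) and the `apply_fix` callback are hypothetical, and the cap of 20 iterations simply mirrors the "five to twenty" figure above.

```python
import json
import subprocess
from typing import Callable, Optional


def parse_result(raw: str) -> dict:
    """Parse the JSON printed by `gibil run ... --json`.

    Field names (stdout, stderr, exit_code) are taken from the examples above.
    """
    return json.loads(raw)


def gibil_run(server: str, command: str) -> dict:
    """Run a command on the server and return the parsed result.

    Assumption: the JSON object is written to gibil's standard output.
    """
    proc = subprocess.run(
        ["gibil", "run", server, command, "--json"],
        capture_output=True,
        text=True,
    )
    return parse_result(proc.stdout)


def run_until_green(
    run_tests: Callable[[], dict],
    apply_fix: Callable[[str, str], None],
    max_iterations: int = 20,
) -> Optional[int]:
    """Repeat test runs until exit_code is 0.

    Returns the 1-based iteration on which the tests passed,
    or None if they never passed within max_iterations.
    """
    for attempt in range(1, max_iterations + 1):
        result = run_tests()
        if result["exit_code"] == 0:
            return attempt
        # Hypothetical hook: the agent reads the failure output and edits code.
        apply_fix(result["stdout"], result["stderr"])
    return None
```

With a server created as pr-42, `run_tests` could be `lambda: gibil_run("pr-42", "cd /root/project && pnpm test")`, and `apply_fix` is whatever the agent uses to edit files on the server. Branching on `exit_code` rather than on the text of `stdout` keeps the loop independent of the test runner's output format.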
Next steps
- Claude Code MCP: connect Claude Code directly via MCP
- CLI: gibil run (full command reference)
- CLI: gibil create (all creation flags)