Agent Code-Test Loop

An AI agent writes code, runs tests, reads failures, fixes, and repeats — on a real server

A typical run takes five to twenty iterations on a real server before every test passes.

This is the core use case. It works today with zero caveats.

Workflow

# Forge a server with your repo cloned and ready
gibil create --name pr-42 --repo github.com/you/project --ttl 60

# Agent loops: run tests → parse output → fix code → repeat
gibil run pr-42 "cd /root/project && pnpm test" --json
# → {"stdout": "3 failed, 39 passed", "stderr": "", "exit_code": 1}

# Agent fixes code via MCP or gibil run, then runs again
gibil run pr-42 "cd /root/project && pnpm test" --json
# → {"stdout": "42 passed", "stderr": "", "exit_code": 0}

# Done — burn it
gibil destroy pr-42
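
From the agent's side, the loop above can be sketched in Python. This is a minimal sketch, not gibil's own client. Only the `gibil run <name> "<command>" --json` invocation and the three JSON fields come from the commands shown above. `run_tests`, `code_test_loop`, and the `propose_fix` callback are hypothetical names, and the sketch assumes the JSON result arrives on the CLI's standard output.

```python
import json
import subprocess
from typing import Callable


def run_tests(server: str, command: str) -> dict:
    """Run a command through `gibil run ... --json` and decode the result.

    Assumption: the JSON object is printed on the CLI's standard output.
    """
    proc = subprocess.run(
        ["gibil", "run", server, command, "--json"],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def code_test_loop(
    run: Callable[[], dict],
    propose_fix: Callable[[dict], None],
    max_iterations: int = 20,
) -> bool:
    """Return True once the tests pass, False if the iteration budget runs out.

    `propose_fix` is a hypothetical hook: it receives the failing result and
    is expected to edit the code on the server (via MCP or `gibil run`).
    """
    for _ in range(max_iterations):
        result = run()
        if result["exit_code"] == 0:
            return True
        propose_fix(result)
    return False


# Example wiring (my_agent_fix stands in for your agent's own fix step):
# code_test_loop(
#     run=lambda: run_tests("pr-42", "cd /root/project && pnpm test"),
#     propose_fix=my_agent_fix,
# )
```

Passing the runner in as a callable keeps the loop independent of the CLI, so it can be exercised with canned results before it touches a real server. The iteration cap keeps a stuck agent from looping until the TTL expires.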

What the agent gets

A real Linux server with:

Your repo cloned to /root/project
Runtime installed (Node, Python, or Go via .gibil.yml)
Root access — no permission issues
SSH for the full TTL window

The --json flag returns structured output the agent can parse without regex:

{
  "stdout": "3 failed, 39 passed",
  "stderr": "",
  "exit_code": 1
}
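
Because the result is plain JSON, a standard JSON decoder is enough to read it. The following is a minimal Python sketch, assuming the three fields shown above are always present. `RunResult` and `parse_run_result` are illustrative names, not part of gibil.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Typed view of the JSON object printed by `gibil run ... --json`."""

    stdout: str
    stderr: str
    exit_code: int

    @property
    def passed(self) -> bool:
        # Usual shell convention: exit code 0 means the command succeeded.
        return self.exit_code == 0


def parse_run_result(raw: str) -> RunResult:
    """Decode the JSON text; raises KeyError if an expected field is missing."""
    data = json.loads(raw)
    return RunResult(
        stdout=data["stdout"],
        stderr=data["stderr"],
        exit_code=data["exit_code"],
    )
```

The agent can branch on `passed` and hand `stdout` and `stderr` to the model unchanged, with no pattern matching on the test runner's text.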

Why gibil

Long-lived session — the server stays up for the full TTL, not just one command
Clean state — fresh Ubuntu 24.04, no leftover artifacts from previous runs
Machine-readable — --json on every command
Auto-cleanup — TTL burns the server when the agent is done (or forgets)
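
The TTL is a safety net, but a harness can still destroy the server as soon as the agent finishes, even if the agent crashes. The following is a minimal Python sketch of that pattern. It uses only the `create` and `destroy` commands and the `--name`, `--repo`, and `--ttl` flags shown in the workflow. `gibil_server` and the injectable `cli` parameter are hypothetical helpers for illustration.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List


def _gibil(args: List[str]) -> None:
    """Invoke the gibil CLI and raise if it exits with a non-zero status."""
    subprocess.run(["gibil", *args], check=True)


@contextmanager
def gibil_server(
    name: str,
    repo: str,
    ttl: int,
    cli: Callable[[List[str]], None] = _gibil,
) -> Iterator[str]:
    """Create a server, yield its name, and destroy it on exit.

    `cli` is injectable so the helper can be tried without a real server.
    """
    cli(["create", "--name", name, "--repo", repo, "--ttl", str(ttl)])
    try:
        yield name
    finally:
        # Runs on success and on error; the TTL covers a harness that dies outright.
        cli(["destroy", name])


# Example:
# with gibil_server("pr-42", "github.com/you/project", 60) as name:
#     ...  # run the code-test loop against `name`
```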

For MCP-compatible agents such as Claude Code or Cursor, pair this with MCP mode: the agent gets typed tools (vm_bash, vm_write) instead of shell strings.

Next steps

Claude Code MCP — connect Claude Code directly via MCP
CLI: gibil run — full command reference
CLI: gibil create — all creation flags
