PR Review with Test Results
Review pull requests with actual test results, not static analysis
An agent checks out the PR branch, builds the project, runs the full test suite, and writes its review from real pass/fail data.
Works with any MCP-capable agent: Claude Code, Cursor, Cline. Examples use Claude because that's what we test against.
With your agent (MCP)
With gibil's MCP server wired up (setup), hand your orchestrator the PR:
Review PR #42 on github.com/you/project. Check it out on a fresh VM, run the
full suite, and give me a review backed by actual test results, not a read-through.
The agent forges a box, checks out the PR ref, and runs the suite via MCP tools:
create_server({ name: "review", repo: "github.com/you/project", ttl: 30 })
vm_bash({ server: "review", command: "git fetch origin pull/42/head:pr-42 && git checkout pr-42" })
vm_bash({ server: "review", command: "pnpm install && pnpm test" }) // → { exit_code: 0, stdout: "..." }
destroy_server({ name: "review" })
The review then says "I ran the tests and they pass," not "the code looks correct."
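As a rough sketch of how that evidence might be turned into a review line, the TypeScript below parses a run result and produces a verdict. It assumes the result shape shown in the examples on this page (`exit_code`, `stdout`, and the CLI's `stderr`). `RunResult` and `verdictFromJson` are hypothetical names used for illustration and are not part of gibil.

```typescript
// Result shape assumed from the examples on this page:
// { exit_code, stdout } via MCP, plus stderr in the CLI's JSON output.
interface RunResult {
  exit_code: number;
  stdout: string;
  stderr?: string;
}

// Hypothetical helper (not a gibil API): parse the JSON printed by a run
// and produce a short verdict the review can quote.
function verdictFromJson(json: string, pr: number): string {
  const result = JSON.parse(json) as RunResult;
  if (result.exit_code === 0) {
    return `PR #${pr}: test suite passed (exit code 0).`;
  }
  // Keep the last few lines of output so the review can cite the failure.
  const output = (result.stderr || result.stdout).trim();
  const tail = output.split("\n").slice(-5).join("\n");
  return `PR #${pr}: test suite FAILED (exit code ${result.exit_code}).\n${tail}`;
}

console.log(verdictFromJson('{"stdout": "...", "stderr": "", "exit_code": 0}', 42));
// → PR #42: test suite passed (exit code 0).
```

Basing the verdict on `exit_code` rather than on matching text in `stdout` keeps the check independent of the test runner's output format.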
By hand (CLI)
gibil create --name review --repo github.com/you/project --ttl 30m --json
gibil run review "cd /root/project && git fetch origin pull/42/head:pr-42 && git checkout pr-42" --json
gibil run review "cd /root/project && pnpm install && pnpm test" --json
# → {"stdout": "...", "stderr": "", "exit_code": 0}
gibil destroy review
Variation: dependency upgrade impact
Forge several VMs, test multiple upgrades in parallel, and report which ones break the build. Via MCP, the orchestrator fans out create_server calls; by hand, use fleet mode:
gibil create --name upgrade --fleet 3 --repo github.com/you/project --ttl 30m --json
gibil run upgrade-1-abc "cd /root/project && pnpm add react@19 && pnpm test" --json
gibil run upgrade-2-abc "cd /root/project && pnpm add next@15 && pnpm test" --json
gibil run upgrade-3-abc "cd /root/project && pnpm add typescript@6 && pnpm test" --json
gibil destroy --all
Each upgrade runs on a clean server. No cross-contamination.
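The "report which ones break the build" step can be sketched as a filter over the per-upgrade results. This assumes each run yields the same `exit_code` field shown in the JSON output above. `breakingUpgrades` is a hypothetical helper, and the exit codes in `sample` are made up for illustration and are not real test results.

```typescript
// Result shape assumed from the CLI's JSON output on this page.
interface RunResult {
  exit_code: number;
  stdout: string;
  stderr?: string;
}

// Hypothetical helper (not a gibil API): given one result per upgrade,
// return the names of the upgrades whose test run exited non-zero.
function breakingUpgrades(results: Record<string, RunResult>): string[] {
  return Object.keys(results).filter((name) => {
    const result = results[name];
    return result !== undefined && result.exit_code !== 0;
  });
}

// Made-up exit codes, purely to show the shape of the input.
const sample: Record<string, RunResult> = {
  "react@19": { exit_code: 0, stdout: "..." },
  "next@15": { exit_code: 1, stdout: "..." },
  "typescript@6": { exit_code: 0, stdout: "..." },
};

console.log(breakingUpgrades(sample).join(", "));
// → next@15
```

Because each upgrade ran on its own server, a non-zero exit code can be attributed to that upgrade alone.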
Why gibil
- Evidence-based reviews: the agent ran the code, not just read it
- Clean server: test results reflect the PR in isolation, not leftover state
- Fan-out for comparisons: test N upgrades or N consumer projects in parallel
Set GITHUB_TOKEN to let the agent push branches and open PRs directly from the server. Private repos also need it for the initial clone. See Remote PR Workflow.
Next steps
- AI Agent via MCP: wire up the MCP server
- Run Agents in Parallel: N PRs at once
- Parallel Test Sharding: fleet mode for larger test suites
- Remote PR Workflow: push and open PRs from servers