Coding agent targets evaluate AI coding assistants and CLI-based agents. These targets require a `judge_target` to run LLM-based evaluators.
workspace_template: ./workspace-templates/my-project
| Field | Required | Description |
|---|---|---|
| `workspace_template` | No | Path to workspace template directory |
| `cwd` | No | Working directory (mutually exclusive with `workspace_template`) |
| `judge_target` | Yes | LLM target for evaluation |
provider: pi-coding-agent
workspace_template: ./workspace-templates/my-project
| Field | Required | Description |
|---|---|---|
| `workspace_template` | No | Path to workspace template directory |
| `cwd` | No | Working directory (mutually exclusive with `workspace_template`) |
| `judge_target` | Yes | LLM target for evaluation |
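Putting these fields together, a complete target entry might look like the sketch below. Only `provider`, `workspace_template`, and `judge_target` come from this page; the enclosing `targets` list, the `name` key, and the names `pi_agent` and `my_judge` are assumptions made for illustration.

```yaml
# Hypothetical layout: the `targets` list and the `name` key are assumed,
# not confirmed by this page.
targets:
  - name: pi_agent                 # assumed identifier for this target
    provider: pi-coding-agent
    workspace_template: ./workspace-templates/my-project
    judge_target: my_judge         # assumed name of an LLM target defined elsewhere
```

To run the agent in an existing directory instead, replace `workspace_template` with `cwd`. The two fields are mutually exclusive, so set only one of them.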
workspace_template: ${{ WORKSPACE_PATH }}
| Field | Required | Description |
|---|---|---|
| `workspace_template` | Yes | Path to workspace template directory |
| `judge_target` | Yes | LLM target for evaluation |
provider: vscode-insiders
workspace_template: ${{ WORKSPACE_PATH }}
Same configuration as VS Code.
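As a rough illustration, a VS Code Insiders target could be written as follows. The `targets` list, the `name` key, and the names `vscode_insiders_agent` and `my_judge` are hypothetical. The sketch also assumes that `${{ WORKSPACE_PATH }}` is resolved from a variable of that name when the file is loaded, which this page does not state explicitly.

```yaml
# Hypothetical layout: the `targets` list and the `name` key are assumed.
targets:
  - name: vscode_insiders_agent    # assumed identifier for this target
    provider: vscode-insiders
    # Required for this provider. Assumed to be filled in from a
    # WORKSPACE_PATH variable at load time.
    workspace_template: ${{ WORKSPACE_PATH }}
    judge_target: my_judge         # assumed name of an LLM target defined elsewhere
```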
Evaluate any command-line agent:
command_template: 'python agent.py --prompt {PROMPT}'
workspace_template: ./workspace-templates/my-project
| Field | Required | Description |
|---|---|---|
| `command_template` | Yes | Command to run. `{PROMPT}` is replaced with the input. |
| `workspace_template` | No | Path to workspace template directory |
| `cwd` | No | Working directory (mutually exclusive with `workspace_template`) |
| `judge_target` | Yes | LLM target for evaluation |
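A command-line target that runs in a fixed directory rather than a workspace template might be sketched like this. The `targets` list, the `name` key, the `./my-project` path, and the names `my_cli_agent` and `my_judge` are assumptions. The `provider` value for command-line targets is not shown on this page, so it is left out here rather than guessed.

```yaml
# Hypothetical layout: the `targets` list and the `name` key are assumed.
targets:
  - name: my_cli_agent             # assumed identifier for this target
    # provider: (value for command-line targets is not given on this page)
    # {PROMPT} is replaced with the eval input before the command runs.
    command_template: 'python agent.py --prompt {PROMPT}'
    cwd: ./my-project              # assumed path; used instead of workspace_template
    judge_target: my_judge         # assumed name of an LLM target defined elsewhere
```

Because `cwd` and `workspace_template` are mutually exclusive, this entry sets only `cwd`.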
For testing the evaluation harness without calling real providers: