Targets Configuration

Targets define which agent or LLM provider to evaluate. They are configured in .agentv/targets.yaml to decouple eval files from provider details.

targets:
  - name: azure_base
    provider: azure
    endpoint: ${{ AZURE_OPENAI_ENDPOINT }}
    api_key: ${{ AZURE_OPENAI_API_KEY }}
    model: ${{ AZURE_DEPLOYMENT_NAME }}
  - name: vscode_dev
    provider: vscode
    workspace_template: ${{ WORKSPACE_PATH }}
    judge_target: azure_base
  - name: local_agent
    provider: cli
    command_template: 'python agent.py --prompt {PROMPT}'
    judge_target: azure_base

Use ${{ VARIABLE_NAME }} syntax to reference values from your .env file:

targets:
  - name: my_target
    provider: anthropic
    api_key: ${{ ANTHROPIC_API_KEY }}
    model: ${{ ANTHROPIC_MODEL }}

This keeps secrets out of version-controlled files.
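
To make the substitution rule concrete, here is a minimal Python sketch of how a ${{ VARIABLE_NAME }} placeholder could be expanded. It is an illustration only, not AgentV's actual implementation: the function name `interpolate`, the regular expression, and the choice to raise an error on an undefined variable are all assumptions.

```python
import re

# Matches ${{ NAME }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\$\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def interpolate(value: str, env: dict) -> str:
    """Replace each ${{ NAME }} in value with env[NAME]."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumption: an undefined variable is an error, not an empty string.
            raise KeyError(f"undefined variable: {name}")
        return env[name]

    return _PLACEHOLDER.sub(_lookup, value)


if __name__ == "__main__":
    # Hypothetical value standing in for an entry loaded from .env.
    env = {"ANTHROPIC_MODEL": "example-model"}
    print(interpolate("${{ ANTHROPIC_MODEL }}", env))
```

In this sketch the values come from a plain dictionary; a real loader would populate it from the .env file before resolving the targets.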

Provider          Type     Description
azure             LLM      Azure OpenAI
anthropic         LLM      Anthropic Claude API
gemini            LLM      Google Gemini
claude-code       Agent    Claude Code CLI
codex             Agent    Codex CLI
pi-coding-agent   Agent    Pi Coding Agent
vscode            Agent    VS Code with Copilot
vscode-insiders   Agent    VS Code Insiders
cli               Agent    Any CLI command
mock              Testing  Mock provider for dry runs

In an eval file, set the default target at the top level, or override it per case:

# Top-level default
execution:
  target: azure_base
evalcases:
  - id: test-1
    # Uses azure_base
  - id: test-2
    execution:
      target: vscode_dev # Override for this case
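
The override rule can be sketched in Python as a two-step lookup: use the case's own execution.target if present, otherwise fall back to the top-level one. The function name `resolve_target` and the use of plain dictionaries for the parsed eval file are assumptions made for illustration, not AgentV's API.

```python
from typing import Optional


def resolve_target(eval_doc: dict, case: dict) -> Optional[str]:
    """Return the target name for one case: case-level override, else top-level default."""
    case_target = case.get("execution", {}).get("target")
    if case_target is not None:
        return case_target
    return eval_doc.get("execution", {}).get("target")


# The eval file above, as it might look once parsed into Python data.
eval_doc = {
    "execution": {"target": "azure_base"},
    "evalcases": [
        {"id": "test-1"},
        {"id": "test-2", "execution": {"target": "vscode_dev"}},
    ],
}

for case in eval_doc["evalcases"]:
    print(case["id"], resolve_target(eval_doc, case))
```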

Agent targets that need LLM-based evaluation specify a judge_target: the name of the LLM target used to run LLM judge evaluators.

targets:
  - name: codex_target
    provider: codex
    judge_target: azure_base # LLM used for judging
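
Because judge_target refers to another target by name, a parsed targets file can be sanity-checked for dangling references. The Python sketch below is a hypothetical helper, not something AgentV is documented to provide; the function name `undefined_judge_targets` and the list-of-dictionaries input shape are assumptions.

```python
def undefined_judge_targets(targets: list) -> list:
    """Return names of targets whose judge_target matches no defined target name."""
    names = {t["name"] for t in targets}
    return [
        t["name"]
        for t in targets
        if "judge_target" in t and t["judge_target"] not in names
    ]


if __name__ == "__main__":
    targets = [
        {"name": "azure_base", "provider": "azure"},
        {"name": "codex_target", "provider": "codex", "judge_target": "azure_base"},
    ]
    # An empty list means every judge_target points at a defined target.
    print(undefined_judge_targets(targets))
```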

For agent targets, workspace_template specifies a directory that gets copied to a temporary location before each eval case runs. This provides isolated, reproducible workspaces.

targets:
  - name: claude_code
    provider: claude-code
    workspace_template: ./workspace-templates/my-project
    judge_target: azure_base

When workspace_template is set:

  • The template directory is copied to ~/.agentv/workspaces/<eval-run-id>/<case-id>/
  • The .git directory is skipped during copy
  • Each eval case gets its own isolated copy
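
The copy step described above could look roughly like this in Python. This is a sketch under stated assumptions rather than AgentV's own code: `prepare_workspace` and its parameters are hypothetical names, and the workspace root is passed in explicitly so the example does not touch the home directory.

```python
import shutil
from pathlib import Path


def prepare_workspace(template: Path, root: Path, run_id: str, case_id: str) -> Path:
    """Copy template into root/<run_id>/<case_id>, skipping .git, and return the new path."""
    dest = root / run_id / case_id
    # copytree creates dest (and missing parents) and fails if dest already
    # exists, so two cases can never share a copy by accident.
    # Note: ignore_patterns(".git") skips entries named .git at any depth,
    # which is a simplification of "the .git directory is skipped".
    shutil.copytree(template, dest, ignore=shutil.ignore_patterns(".git"))
    return dest
```

To mirror the layout listed above, the root would be `Path.home() / ".agentv" / "workspaces"`.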

By default:

  • Success: Workspace is cleaned up automatically
  • Failure: Workspace is preserved for debugging

Override with CLI flags:

  • --keep-workspaces: Always preserve workspaces
  • --cleanup-workspaces: Always clean up, even on failure
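
The default behavior and the two flags combine into a small decision rule, sketched here in Python. The function name `should_cleanup` is hypothetical, and rejecting both flags together is an assumption: the text above does not say what happens in that case.

```python
def should_cleanup(passed: bool, keep_workspaces: bool = False,
                   cleanup_workspaces: bool = False) -> bool:
    """Decide whether a case's workspace is deleted after the case finishes."""
    if keep_workspaces and cleanup_workspaces:
        # Assumption: the documented behavior for both flags at once is unknown,
        # so this sketch treats the combination as an error.
        raise ValueError("--keep-workspaces and --cleanup-workspaces conflict")
    if keep_workspaces:
        return False  # always preserve
    if cleanup_workspaces:
        return True   # always clean up, even on failure
    return passed     # default: clean up on success, preserve on failure
```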

Option              Use Case
cwd                 Run in an existing directory (shared across cases)
workspace_template  Copy template to temp location (isolated per case)

cwd and workspace_template are mutually exclusive. If neither is set, the eval file’s directory is used as the working directory.
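
As a rough illustration of this rule, the Python sketch below picks a working directory from a parsed target entry. The name `resolve_workdir`, the dictionary input, and the `workspace_copy` parameter (the path of the per-case copy made from the template) are assumptions for the example, not AgentV's interface.

```python
from pathlib import Path
from typing import Optional


def resolve_workdir(target: dict, eval_file: Path,
                    workspace_copy: Optional[Path] = None) -> Path:
    """Choose the working directory for one eval case."""
    if "cwd" in target and "workspace_template" in target:
        raise ValueError("cwd and workspace_template are mutually exclusive")
    if "cwd" in target:
        # Existing directory, shared across all cases.
        return Path(target["cwd"])
    if "workspace_template" in target:
        if workspace_copy is None:
            raise ValueError("workspace_template set but no copy was prepared")
        # Isolated per-case copy of the template.
        return workspace_copy
    # Neither option set: fall back to the eval file's directory.
    return eval_file.parent
```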