---
name: generate-config
description: Generate and validate mcpbr configuration files for MCP server benchmarking.
---

# Instructions

You are an expert at creating valid mcpbr configuration files. Your goal is to help users create correct YAML configs for their MCP servers.
## Critical Requirements

- **Always Include `{workdir}` Placeholder:** The `args` array MUST include `"{workdir}"` as a placeholder for the task repository path. This is CRITICAL - mcpbr replaces it at runtime with the actual working directory.
- **Valid Commands:** Ensure the `command` field uses an executable that exists on the user's system:
  - `npx` for Node.js-based MCP servers
  - `uvx` for Python MCP servers via uv
  - `python` or `python3` for direct Python execution
  - Custom binaries (verify they exist with `which <command>`)
- **Model Aliases:** Use short aliases when possible:
  - `sonnet` instead of `claude-sonnet-4-5-20250929`
  - `opus` instead of `claude-opus-4-5-20251101`
  - `haiku` instead of `claude-haiku-4-5-20251001`
- **Required Fields:** Every config MUST have:
  - `mcp_server.command`
  - `mcp_server.args` (with `"{workdir}"`)
  - `provider` (usually `"anthropic"`)
  - `agent_harness` (usually `"claude-code"`)
  - `model`
  - `dataset` (or rely on the benchmark default)
## Common MCP Server Configurations

### Anthropic Filesystem Server

```yaml
mcp_server:
  name: "filesystem"
  command: "npx"
  args:
    - "-y"
    - "@modelcontextprotocol/server-filesystem"
    - "{workdir}"
  env: {}
```
### Custom Python MCP Server

```yaml
mcp_server:
  name: "my-server"
  command: "uvx"
  args:
    - "my-mcp-server"
    - "--workspace"
    - "{workdir}"
  env:
    LOG_LEVEL: "debug"
```
### Supermodel Codebase Analysis

```yaml
mcp_server:
  name: "supermodel"
  command: "npx"
  args:
    - "-y"
    - "@supermodeltools/mcp-server"
    - "{workdir}"  # CRITICAL: mcpbr substitutes the task repository path here
  env:
    SUPERMODEL_API_KEY: "${SUPERMODEL_API_KEY}"
```
## Configuration Template

When generating a new config, use this template:

```yaml
mcp_server:
  name: "<server-name>"
  command: "<executable>"
  args:
    - "<arg1>"
    - "<arg2>"
    - "{workdir}"  # CRITICAL: Include this placeholder
  env: {}

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"  # or "opus", "haiku"
dataset: "SWE-bench/SWE-bench_Lite"  # or null to use benchmark default
sample_size: 5
timeout_seconds: 300
max_concurrent: 4
max_iterations: 30
```
## Validation Steps

Before saving a config, validate:

- **Workdir Placeholder:** Ensure `"{workdir}"` appears in the `args` array.
- **Command Exists:** Verify the command is available:

  ```bash
  which npx  # or uvx, python, etc.
  ```

- **Syntax:** YAML syntax is correct (no tabs, proper indentation).
- **Environment Variables:** If using env vars like `${API_KEY}`, remind the user to set them.
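The first three checks can be automated. The sketch below is illustrative and not part of mcpbr; the `validate_config_text` helper is a hypothetical name, and it inspects the raw YAML text rather than parsing it:

```python
import re
import shutil

def validate_config_text(text: str) -> list[str]:
    """Return a list of problems found in an mcpbr config's raw YAML text.

    Hypothetical helper mirroring the manual validation steps above;
    mcpbr does not ship this function.
    """
    problems = []
    # 1. The args array must carry the {workdir} placeholder.
    if '"{workdir}"' not in text:
        problems.append('missing "{workdir}" placeholder in args')
    # 2. The command executable should exist on PATH.
    match = re.search(r'command:\s*"?([\w./-]+)"?', text)
    if match and shutil.which(match.group(1)) is None:
        problems.append(f"command not found on PATH: {match.group(1)}")
    # 3. YAML is whitespace-sensitive: tab indentation is invalid.
    if "\t" in text:
        problems.append("tab characters found; use 2-space indentation")
    return problems

config = """\
mcp_server:
  name: "filesystem"
  command: "sh"
  args:
    - "{workdir}"
"""
print(validate_config_text(config))  # an empty list if everything checks out
```

A real-world version would parse the YAML properly before checking fields, but simple text checks catch the most common failures cheaply.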
## Benchmark-Specific Configurations

### SWE-bench (Default)

```yaml
# ... mcp_server config ...

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
dataset: "SWE-bench/SWE-bench_Lite"  # or SWE-bench/SWE-bench_Verified
sample_size: 10
```

### CyberGym

```yaml
# ... mcp_server config ...

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "cybergym"
dataset: "sunblaze-ucb/cybergym"
cybergym_level: 2  # 0-3
sample_size: 10
```

### MCPToolBench++

```yaml
# ... mcp_server config ...

provider: "anthropic"
agent_harness: "claude-code"
model: "sonnet"
benchmark: "mcptoolbench"
dataset: "MCPToolBench/MCPToolBenchPP"
sample_size: 10
```
## Custom Agent Prompts

Users can customize the agent prompt using the `agent_prompt` field:

```yaml
agent_prompt: |
  Fix the following bug in this repository:

  {problem_statement}

  Make the minimal changes necessary to fix the issue.
  Focus on the root cause, not symptoms.
```

**Important:** The `{problem_statement}` placeholder is required and will be replaced with the actual task description.
## Common Mistakes to Avoid

- **Missing `{workdir}`:** Forgetting to include `"{workdir}"` in `args`.
- **Hardcoded Paths:** Never hardcode absolute paths like `/workspace` or `/tmp/repo`.
- **Invalid Commands:** Using commands that don't exist (e.g., `uv` instead of `uvx`).
- **Wrong Indentation:** YAML is whitespace-sensitive. Use 2 spaces, not tabs.
- **Missing Quotes:** Environment variable references like `"${VAR}"` need quotes.
## Example Workflow

When a user asks to create a config:

1. **Ask about their MCP server:**
   - What package/command runs the server?
   - Does it need any special arguments or environment variables?
   - Is it Node.js-based (`npx`) or Python-based (`uvx`)?

2. **Generate the config** based on their answers.

3. **Validate the config:**
   - Check for the `{workdir}` placeholder
   - Verify the command exists
   - Confirm YAML syntax

4. **Save the config** (usually to `mcpbr.yaml`).

5. **Optionally test the config** with a small sample:

   ```bash
   mcpbr run -c mcpbr.yaml -n 1 -v
   ```
Helpful Commands
# Generate a default config
mcpbr init
# List available models
mcpbr models
# List available benchmarks
mcpbr benchmarks
# Validate config by doing a dry run with 1 task
mcpbr run -c config.yaml -n 1 -v