Agent Templates — Rules for Modification
These rules apply ONLY when modifying agent templates in this repository. An "agent template" is any top-level directory with the agent- prefix, excluding TypeScript/JavaScript templates (e.g., agent-langchain-ts).
CRITICAL: Sync After Modifying Shared Sources
Shared files are copied from source-of-truth directories into each template. Never manually edit synced copies (e.g. {template}/quickstart.py, {template}/start_app.py, {template}/evaluate_agent.py) — always edit the source file in .scripts/source/ or .claude/skills/ first, then run the sync command to propagate changes to all templates.
| Source directory | Sync command | What it syncs |
|---|---|---|
.scripts/source/ | uv run python .scripts/sync-scripts.py | quickstart.py, start_app.py, evaluate_agent.py |
.claude/skills/ | uv run python .scripts/sync-skills.py | All skill directories under {template}/.claude/skills/ |
After modifying any source file, run the corresponding sync command and commit the synced copies.
Template Registry
.scripts/templates.py is the canonical registry — see TEMPLATES dict for the full list of templates, their SDKs, and bundle names. All sync scripts and e2e tests import from it.
To add a new template: add its entry to TEMPLATES in .scripts/templates.py, create the directory, then run both sync commands.
Template Conventions
Handler naming
All handlers live in agent_server/agent.py. Use @invoke() and @stream() decorators from mlflow.genai.agent_server:
@invoke()
async def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: ...
@stream()
async def stream_handler(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]: ...
LangGraph invoke_handler delegates to stream_handler. OpenAI SDK invoke_handler calls Runner.run() independently.
MLflow autologging
- LangGraph templates:
mlflow.langchain.autolog() - OpenAI SDK templates:
mlflow.openai.autolog()+set_trace_processors([])
All handlers tag traces with: mlflow.update_current_trace(metadata={"mlflow.trace.session": session_id})
MCP server initialization
- LangGraph:
DatabricksMultiServerMCPClientwrappingDatabricksMCPServerobjects. Tools fetched once at agent init via.get_tools(). - OpenAI SDK:
McpServerused as async context manager per-request:async with await init_mcp_server() as mcp_server:
Session/memory patterns
| Pattern | SDK | Class | ID field |
|---|---|---|---|
| Short-term memory | LangGraph | AsyncCheckpointSaver | thread_id via config["configurable"] |
| Long-term memory | LangGraph | AsyncDatabricksStore | user_id via config["configurable"] |
| Short-term memory | OpenAI | AsyncDatabricksSession | session_id passed to Runner.run(..., session=) |
All memory templates return the ID in custom_outputs so clients can reuse it.
databricks.yml conventions
bundle.nameuses underscores:agent_langgraph- App
nameuses hyphens:agent-langgraph - Memory template app names are abbreviated (
-stm,-ltm) to stay within the 30-char limit - App command:
["uv", "run", "start-app"] - Lakebase resources use
permission: 'CAN_CONNECT_AND_CREATE' - Lakebase templates use
<your-lakebase-instance-name>as placeholder — quickstart replaces it - Templates do not define a DAB-managed experiment resource (
resources.experiments); instead, the app resource references an experiment by ID (initially empty), and quickstart fills in the literal experiment ID
app.yaml files
The 3 base templates (agent-langgraph, agent-openai-agents-sdk, agent-non-conversational) have app.yaml files for UI-based template creation in the Databricks UI. These are separate from databricks.yml and use valueFrom (camelCase) for resource references. Memory/multiagent variants do not need separate app.yaml files.
Per-template AGENTS.md
Each template has its own {template}/AGENTS.md (loaded via {template}/CLAUDE.md). These are maintained individually and are not synced — they contain template-specific guidance for end users.
Tests
Quickstart unit tests
The quickstart unit tests (test_quickstart.py) live in .scripts/source/ but have no pyproject.toml of their own. Run them from .scripts/agent-integration-tests/ which has pytest installed:
cd .scripts/agent-integration-tests && uv run pytest ../source/test_quickstart.py -v
E2E integration tests
Tests live in .scripts/agent-integration-tests/. Run from that directory.
Default runs both local and deploy. Only add --skip-deploy or --skip-local when the user explicitly asks.
CRITICAL: Never pipe test output through head, tail, or other truncating commands. This sends SIGPIPE to pytest, killing it before cleanup and leaving template files dirty. Always run pytest directly and use run_in_background to check on it later.
# DEFAULT: All templates in parallel (local + deploy)
uv run pytest test_e2e.py -v -n 7
# Single template (still runs both local + deploy)
uv run pytest test_e2e.py -v --template agent-langgraph
# Sequential with live output (for debugging)
uv run pytest test_e2e.py -v -n0 -s --template agent-langgraph
Template test configs are in .scripts/agent-integration-tests/template_config.py.
Editing Workflow Summary
- Changing a shared script (
quickstart.py,start_app.py,evaluate_agent.py) — edit in.scripts/source/, runuv run python .scripts/sync-scripts.py - Changing a skill — edit in
.claude/skills/, runuv run python .scripts/sync-skills.py - Changing template-specific agent code — edit directly in
{template}/agent_server/ - Adding a new template — add to
.scripts/templates.py, create directory, run both sync commands - Changing
databricks.yml— edit directly in the template (not synced) - After any change — run e2e tests:
cd .scripts/agent-integration-tests && uv run pytest test_e2e.py -v -n 7 --skip-deploy - After any change — review this file (
.claude/AGENTS.md) and each affected template'sAGENTS.mdfor inaccuracies, then update them to reflect the new state