name: cli-interactive-testing
description: Test and validate DyGram machines using CLI interactive mode. Step through execution, provide intelligent responses, debug behavior, and create test recordings.
CLI Interactive Testing Skill
Execute and validate DyGram machines using CLI interactive mode for intelligent turn-by-turn testing.
Purpose
This skill guides you through using the CLI interactive mode to:
- Test machines by executing them step-by-step
- Debug behavior by observing state at each turn
- Provide intelligent responses when LLM decisions are needed
- Create test recordings for automated CI/CD playback
- Validate multiple scenarios (success, error, edge cases)
Quick Start
Basic Testing Workflow
# 1. Start interactive execution
dygram execute --interactive machine.dy --id test-01
# 2. Continue execution turn-by-turn
dygram execute --interactive machine.dy --id test-01
# 3. Check status at any time
dygram exec status test-01
# 4. Provide response when needed
echo '{"response": "Continue", "tools": [...]}' | \
dygram execute --interactive machine.dy --id test-01
Core Concepts
Turn-by-Turn Execution
Each CLI call executes one turn (one LLM invocation):
- State persists to disk (.dygram/executions/<id>/)
- Machine snapshot prevents definition changes mid-execution
- History logs all turns (history.jsonl)
- Auto-resumes from last state
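Because the history log is plain JSON Lines (one record per turn), the number of turns executed so far can be read with standard tools. A minimal sketch using a stand-in file; in a real run the log lives at .dygram/executions/<id>/history.jsonl, and the record fields shown are illustrative:

```shell
# Stand-in history file; a real log lives at .dygram/executions/<id>/history.jsonl
printf '%s\n' '{"turn": 1, "node": "start"}' '{"turn": 2, "node": "validate"}' > history.jsonl

# One record per line, so a line count gives the number of turns executed
turns=$(grep -c '' history.jsonl)
echo "turns executed: $turns"
```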
Response Modes
1. Auto-continue (no stdin):
dygram e -i machine.dy --id test
Used for: Task nodes without LLM, simple transitions
2. Manual response (stdin):
echo '{"response": "...", "tools": [...]}' | dygram e -i machine.dy --id test
Used for: Agent nodes, complex decisions, testing specific paths
3. Playback mode (recordings):
dygram e -i machine.dy --playback recordings/golden/ --id test
Used for: Deterministic testing, CI/CD validation
Detailed Workflow
Step 1: Understand the Machine
Before testing, read and understand the machine:
# Read machine definition
cat machines/payment-workflow.dy
# Generate visualization
dygram generate machines/payment-workflow.dy --format html
# Validate syntax
dygram parseAndValidate machines/payment-workflow.dy
Step 2: Start Interactive Execution
Choose execution mode based on goal:
For debugging/exploration:
dygram e -i machines/payment-workflow.dy --id debug
For creating test recordings:
dygram e -i machines/payment-workflow.dy \
--record recordings/payment-workflow/ \
--id recording-001
For validating with existing recordings:
dygram e -i machines/payment-workflow.dy \
--playback recordings/payment-workflow/ \
--id playback-001
Step 3: Execute Turn-by-Turn
Continue execution, observing and providing input as needed:
# Execute next turn
dygram e -i machines/payment-workflow.dy --id debug
# Check what happened
dygram exec status debug
# View execution history
cat .dygram/executions/debug/history.jsonl | tail -5
# Check current state
cat .dygram/executions/debug/state.json | jq '.executionState.currentNode'
Step 4: Provide Intelligent Responses
When the machine needs an LLM decision, analyze the context and provide a thoughtful response:
# First, understand what's needed
cat .dygram/executions/debug/state.json | jq '.executionState.turnState'
# Provide thoughtful response
echo '{
"response": "Validating payment credentials",
"tools": [
{"name": "validate_payment", "params": {"amount": 100}}
]
}' | dygram e -i machines/payment-workflow.dy --id debug
Step 5: Continue Until Complete
# Option 1: Manual stepping
dygram e -i machines/payment-workflow.dy --id debug
dygram e -i machines/payment-workflow.dy --id debug
# ... until complete
# Option 2: Loop (with manual responses when needed)
while dygram e -i machines/payment-workflow.dy --id debug 2>&1 | \
  grep -q "Turn completed"; do
  echo "Turn completed, continuing..."
done
Step 6: Validate Results
# Check final status
dygram exec status debug
# Review full history
cat .dygram/executions/debug/history.jsonl
# Check final state
cat .dygram/executions/debug/state.json | jq '.status'
# If recording mode, verify recordings
ls -la recordings/payment-workflow/
Providing Intelligent Responses
Response Format
{
  "response": "Your reasoning and explanation",
  "tools": [
    {
      "name": "tool_name",
      "params": {
        "param1": "value1",
        "param2": "value2"
      }
    }
  ]
}
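Before piping a hand-written payload to the CLI, it helps to confirm it parses as JSON and carries the expected top-level keys, since a malformed payload wastes a turn. A minimal sketch assuming python3 is available; the key check mirrors the format above and is a generic JSON sanity check, not an official validator:

```shell
# Hand-written payload intended for the CLI (keys per the format above)
payload='{"response": "Validating input", "tools": [{"name": "validate_payment", "params": {"amount": 100}}]}'

# Generic JSON check: parses cleanly and has the expected top-level keys
result=$(echo "$payload" | python3 -c '
import json, sys
doc = json.load(sys.stdin)
ok = isinstance(doc.get("response"), str) and isinstance(doc.get("tools"), list)
print("payload OK" if ok else "payload INVALID")
')
echo "$result"
```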
Decision-Making Process
1. Analyze Context
   - What node are we at?
   - What tools are available?
   - What is the task prompt asking for?
2. Understand Intent
   - What is the machine trying to accomplish?
   - What would a real agent do here?
   - Are there multiple valid paths?
3. Choose Semantically
   - Don't just pattern-match keywords
   - Consider the machine's goal
   - Test different scenarios (success/error/edge)
4. Document Reasoning
   - Include a clear explanation in the response
   - This helps when reviewing recordings later
Example Responses
Simple continuation:
echo '{"action": "continue"}' | dygram e -i machine.dy --id test
File operation:
echo '{
"response": "Reading configuration file to determine environment",
"tools": [
{"name": "read_file", "params": {"path": "config.json"}}
]
}' | dygram e -i machine.dy --id test
Transition decision:
echo '{
"response": "Payment validation succeeded, transitioning to confirmation state",
"tools": [
{"name": "transition_to_confirmation", "params": {}}
]
}' | dygram e -i machine.dy --id test
Multiple tools:
cat <<'EOF' | dygram e -i machine.dy --id test
{
"response": "Analyzing data and generating report",
"tools": [
{"name": "read_file", "params": {"path": "data.json"}},
{"name": "analyze_data", "params": {"format": "summary"}},
{"name": "write_file", "params": {
"path": "report.txt",
"content": "Analysis complete"
}}
]
}
EOF
Testing Patterns
Pattern 1: Debug Single Execution
Step through to understand behavior:
# Start
dygram e -i machine.dy --id debug --verbose
# Step through with observation
for i in {1..10}; do
  echo "=== Turn $i ==="
  dygram e -i machine.dy --id debug
  # Check state
  dygram exec status debug
  # Review last history entry
  tail -1 .dygram/executions/debug/history.jsonl | jq '.'
  # Pause for review
  read -p "Continue? (y/n) " -n 1 -r
  echo
  [[ ! $REPLY =~ ^[Yy]$ ]] && break
done
Pattern 2: Create Golden Recording
# Start with recording
dygram e -i machine.dy \
--record recordings/golden-test/ \
--id golden
# Execute with intelligent responses
# (provide responses as machine requires them)
# Continue until complete
while dygram e -i machine.dy --id golden; do
  echo "Turn completed"
done
# Verify recording
ls -la recordings/golden-test/
dygram e -i machine.dy \
--playback recordings/golden-test/ \
--id verify
# Commit to git
git add recordings/golden-test/
git commit -m "Add golden recording for machine"
Pattern 3: Test Multiple Scenarios
# Success path
dygram e -i machine.dy --record recordings/success/ --id success
# ... provide success responses ...
# Error path
dygram e -i machine.dy --record recordings/error/ --id error
# ... provide error responses ...
# Edge case
dygram e -i machine.dy --record recordings/edge/ --id edge
# ... provide edge case responses ...
# Validate all scenarios
for scenario in success error edge; do
  echo "Testing $scenario..."
  dygram e -i machine.dy \
    --playback "recordings/$scenario/" \
    --id "test-$scenario"
done
Pattern 4: Batch Test Multiple Machines
#!/bin/bash
for machine in machines/*.dy; do
name=$(basename "$machine" .dy)
echo "Testing: $name"
# Start with recording
dygram e -i "$machine" \
--record "recordings/$name/" \
--id "$name" \
--verbose 2>&1 | tee "logs/$name.log"
# Continue until complete or error
attempts=0
max_attempts=20
while [ $attempts -lt $max_attempts ]; do
if dygram e -i "$machine" --id "$name"; then
((attempts++))
else
echo "Completed or errored after $attempts turns"
break
fi
done
# Check result
if dygram exec status "$name" | grep -q "complete"; then
echo "✓ $name: SUCCESS"
else
echo "✗ $name: FAILED or INCOMPLETE"
fi
# Clean up
dygram exec rm "$name"
done
Pattern 5: Compare Before/After
Test behavior changes:
# Record baseline
git checkout main
dygram e -i machine.dy --record recordings/baseline/ --id baseline
# ... execute ...
# Record with changes
git checkout feature-branch
dygram e -i machine.dy --record recordings/feature/ --id feature
# ... execute ...
# Compare recordings
diff -u recordings/baseline/ recordings/feature/
# Validate both still work
dygram e -i machine.dy --playback recordings/baseline/ --id test-baseline
dygram e -i machine.dy --playback recordings/feature/ --id test-feature
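Because recordings are JSON, a structural comparison can be more informative than a textual diff, which also flags whitespace and formatting changes that don't affect playback. A sketch using two stand-in turn files (the file names and contents are illustrative):

```shell
# Two stand-in turn files that differ only in formatting
cat > turn-a.json <<'EOF'
{"response": {"content": [], "stop_reason": "tool_use"}}
EOF
cat > turn-b.json <<'EOF'
{
  "response": {"content": [], "stop_reason": "tool_use"}
}
EOF

# Parse both and compare the resulting objects, not the raw bytes
verdict=$(python3 -c '
import json
a = json.load(open("turn-a.json"))
b = json.load(open("turn-b.json"))
print("match" if a == b else "differ")
')
echo "structural comparison: $verdict"
```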
Recording Management
Creating Recordings
Recordings capture LLM responses for deterministic replay:
dygram e -i machine.dy --record recordings/test-case/ --id test
Recording structure:
recordings/test-case/
├── turn-1.json # First LLM invocation
├── turn-2.json # Second LLM invocation
└── turn-3.json # Third LLM invocation
Recording content:
{
  "request": {
    "systemPrompt": "...",
    "tools": [...]
  },
  "response": {
    "content": [...],
    "stop_reason": "tool_use"
  }
}
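A recorded turn can be inspected with standard JSON tooling. A minimal sketch assuming python3 is available; the file content below is a stand-in following the structure shown above:

```shell
# Stand-in recorded turn, following the structure shown above
cat > turn-1.json <<'EOF'
{"request": {"systemPrompt": "...", "tools": []},
 "response": {"content": [], "stop_reason": "tool_use"}}
EOF

# Pull out why the model stopped on this turn
stop_reason=$(python3 -c '
import json
turn = json.load(open("turn-1.json"))
print(turn["response"]["stop_reason"])
')
echo "stop_reason: $stop_reason"
```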
Using Recordings
# Playback deterministically
dygram e -i machine.dy --playback recordings/test-case/ --id playback
# Continue playback
while dygram e -i machine.dy --id playback; do :; done
Organizing Recordings
Recommended structure:
recordings/
├── golden/ # Golden path tests
│ ├── basic-workflow/
│ ├── payment-flow/
│ └── approval-process/
├── edge-cases/ # Edge case scenarios
│ ├── empty-input/
│ ├── max-length/
│ └── special-chars/
├── error-handling/ # Error scenarios
│ ├── missing-file/
│ ├── invalid-data/
│ └── timeout/
└── regression/ # Regression tests
├── bug-123-fix/
├── bug-456-fix/
└── feature-789/
Maintaining Recordings
# Update recording when behavior intentionally changes
dygram e -i machine.dy \
--record recordings/golden/workflow/ \
--id update \
--force # Force new recording
# Validate all recordings still work
for dir in recordings/golden/*/; do
  name=$(basename "$dir")
  echo "Testing: $name"
  dygram e -i "machines/$name.dy" \
    --playback "$dir" \
    --id "validate-$name"
done
State Management
Execution State Files
State is stored in .dygram/executions/<id>/:
.dygram/executions/test-01/
├── state.json # Current execution state
├── metadata.json # Execution metadata
├── machine.json # Machine snapshot (prevents mid-execution changes)
└── history.jsonl # Turn-by-turn history log
Inspecting State
# View current node
cat .dygram/executions/test-01/state.json | jq '.executionState.currentNode'
# View turn state (if in turn)
cat .dygram/executions/test-01/state.json | jq '.executionState.turnState'
# View visited nodes
cat .dygram/executions/test-01/state.json | jq '.executionState.visitedNodes'
# View attributes
cat .dygram/executions/test-01/state.json | jq '.executionState.attributes'
# View metadata
cat .dygram/executions/test-01/metadata.json | jq '.'
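If jq is unavailable, the same fields can be read with python3. A sketch using a stand-in state file; the demo path and contents below are illustrative, mirroring the layout above:

```shell
# Stand-in state file; a real one lives at .dygram/executions/<id>/state.json
mkdir -p demo-execution
cat > demo-execution/state.json <<'EOF'
{"executionState": {"currentNode": "validate_payment", "visitedNodes": ["start"]}}
EOF

# Read the current node without jq
current=$(python3 -c '
import json
state = json.load(open("demo-execution/state.json"))
print(state["executionState"]["currentNode"])
')
echo "current node: $current"
```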
Managing Executions
# List all executions
dygram exec list
# Show specific execution status
dygram exec status test-01
# Remove execution
dygram exec rm test-01
# Clean completed executions
dygram exec clean
Troubleshooting
Execution Not Progressing
Check if waiting for input:
dygram exec status <id>
cat .dygram/executions/<id>/state.json | jq '.executionState.turnState'
Provide required response:
echo '{"response": "...", "tools": [...]}' | dygram e -i machine.dy --id <id>
Wrong Path Taken
Restart from beginning:
dygram exec rm <id>
dygram e -i machine.dy --id <id> --force
Or start new execution:
dygram e -i machine.dy --id <id>-retry
Recording Playback Mismatch
Check recording content:
ls -la recordings/test-case/
cat recordings/test-case/turn-1.json | jq '.'
Verify machine hasn't changed:
# Compare machine hashes
cat .dygram/executions/<id>/metadata.json | jq '.dyash'
Re-record if machine changed:
dygram e -i machine.dy --record recordings/test-case/ --id new --force
State Corruption
View error details:
cat .dygram/executions/<id>/state.json | jq '.status'
Force fresh start:
dygram exec rm <id>
dygram e -i machine.dy --id <id> --force
Best Practices
1. Always Use Explicit IDs
# Good: Explicit ID for tracking
dygram e -i machine.dy --id test-payment-success
# Avoid: Auto-generated IDs are hard to track
dygram e -i machine.dy
2. Create Recordings for Important Tests
# Record golden path
dygram e -i machine.dy --record recordings/golden/ --id golden
# Commit to git
git add recordings/golden/
git commit -m "Add golden recording for regression testing"
3. Use Verbose Mode for Debugging
dygram e -i machine.dy --id debug --verbose
4. Check State Frequently
# After each significant turn
dygram e -i machine.dy --id test
dygram exec status test
5. Clean Up Test Executions
# After testing
dygram exec rm test-01
dygram exec clean
6. Document Test Scenarios
# Create a test plan
cat > TEST_PLAN.md <<'EOF'
# Payment Workflow Tests
## Scenarios
1. Success path: recordings/payment-success/
2. Invalid card: recordings/payment-invalid/
3. Timeout: recordings/payment-timeout/
4. Retry success: recordings/payment-retry/
## Run Tests
for scenario in success invalid timeout retry; do
  dygram e -i payment.dy \
    --playback recordings/payment-$scenario/ \
    --id test-$scenario
done
EOF
Integration with CI/CD
Local Development
# 1. Develop machine
vim machines/workflow.dy
# 2. Test interactively
dygram e -i machines/workflow.dy \
--record recordings/workflow/ \
--id workflow-test
# 3. Commit machine and recordings
git add machines/workflow.dy recordings/workflow/
git commit -m "Add workflow machine with tests"
CI Configuration
# .github/workflows/test.yml
name: Test DyGram Machines

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install DyGram
        run: npm install -g dygram

      - name: Test All Machines
        run: |
          for recording in recordings/golden/*/; do
            machine=$(basename "$recording")
            echo "Testing: $machine"
            dygram execute --interactive \
              "machines/$machine.dy" \
              --playback "$recording" \
              --id "ci-$machine"
            # Check result
            if ! dygram exec status "ci-$machine" | grep -q "complete"; then
              echo "FAILED: $machine"
              exit 1
            fi
            echo "PASSED: $machine"
          done
Summary Checklist
When testing a machine, ensure you:
- Read and understand the machine definition
- Start with an explicit execution ID
- Use --record if creating test recordings
- Step through execution, observing state
- Provide intelligent responses when needed
- Check status frequently with dygram exec status
- Validate final state and results
- Verify recordings if created
- Clean up test executions when done
- Commit recordings for CI/CD if appropriate
See Also
- CLI Interactive Mode Guide: docs/cli/interactive-mode.md
- CLI Reference: docs/cli/README.md
- Agent: dygram-test-responder (auto-loaded)
- Examples: examples/ directory
Remember: You have intelligent reasoning - use it! Understand context, make semantic decisions, and test edge cases. Don't just pattern-match; think about what the machine is trying to accomplish.