---
name: devops-workflow-engineer
description: >
  Use when designing GitHub Actions workflows, creating CI/CD pipelines, planning
  multi-environment deployments, optimizing pipeline cost and execution time, or
  implementing deployment strategies (blue-green, canary, rolling). Generates
  production-ready workflow YAML, analyzes existing pipelines for optimization, and
  creates deployment plans.
license: MIT + Commons Clause
metadata:
  version: 1.1.0
  author: borghei
  category: engineering
  domain: devops
  updated: 2026-04-02
  tags: [github-actions, ci-cd, deployment, workflows]
  python-tools: workflow_generator.py, pipeline_analyzer.py, deployment_planner.py
  tech-stack: python, github-actions, yaml, ci-cd
---
DevOps Workflow Engineer
The agent generates GitHub Actions workflow YAML, analyzes existing pipelines for optimization opportunities, and creates deployment plans with strategy selection, health checks, and rollback procedures.
Quick Start
# Generate a CI workflow
python scripts/workflow_generator.py --type ci --language python --test-framework pytest
# Analyze existing pipelines for optimization
python scripts/pipeline_analyzer.py .github/workflows/ --format json
# Plan a deployment strategy
python scripts/deployment_planner.py --type webapp --environments dev,staging,prod --strategy canary
Tools Overview
| Tool | Input | Output |
|---|---|---|
| workflow_generator.py | Workflow type + language | GitHub Actions YAML (ci, cd, release, security-scan, docs-check) |
| pipeline_analyzer.py | Workflow file or directory | Optimization findings, cost estimates, severity ratings |
| deployment_planner.py | Project type + environments | Deployment plan with strategy, health checks, rollback |
All tools accept --format json for machine-readable output and --output (short form -o) to write the result to a file.
Workflow 1: CI Pipeline Design
The agent generates pipelines following fail-fast ordering:
- Lint and format (~30s) -- cheapest gate first
- Unit tests (~2-5m) -- matrix across versions
- Build verification (~3-8m)
- Integration tests (~5-15m, parallel with build)
- Security scanning (~2-5m)
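jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint
  test:
    needs: lint
    runs-on: ubuntu-latest        # every job needs its own runs-on
    strategy:
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    steps:
      - uses: actions/checkout@v4  # each job starts from a clean workspace
      - uses: actions/setup-python@v5
        with: { python-version: "${{ matrix.python-version }}", cache: pip }
      - run: pip install -r requirements.txt  # assumes pytest is listed here
      - run: pytest --junitxml=results.xml
  security:
    needs: lint                   # runs in parallel with test once lint passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pip install pip-audit
      - run: pip-audit -r requirements.txt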
CI targets:
| Metric | Target | Fix |
|---|---|---|
| Total CI time | < 10 min | Parallelize, add caching |
| Lint step | < 1 min | Use pre-commit locally |
| Unit tests | < 5 min | Split suites, use matrix |
| Flaky rate | < 1% | Quarantine flaky tests |
| Cache hit rate | > 80% | Review cache keys |
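The targets in the table above can be checked mechanically once the metrics have been collected. The following is a minimal sketch, assuming the metrics arrive as a plain dict; the metric key names and the check_ci_targets function are hypothetical and are not part of the bundled tools.

```python
# Hypothetical helper: compare measured CI metrics against the targets table.
# The metric names and this function are illustrative assumptions.
import operator

# metric name -> (comparison that must hold, threshold, suggested fix)
CI_TARGETS = {
    "total_ci_minutes": (operator.lt, 10, "Parallelize, add caching"),
    "lint_minutes": (operator.lt, 1, "Use pre-commit locally"),
    "unit_test_minutes": (operator.lt, 5, "Split suites, use matrix"),
    "flaky_rate_pct": (operator.lt, 1, "Quarantine flaky tests"),
    "cache_hit_rate_pct": (operator.gt, 80, "Review cache keys"),
}


def check_ci_targets(metrics):
    """Return {metric: suggested fix} for each supplied metric that misses its target."""
    misses = {}
    for name, (meets, threshold, fix) in CI_TARGETS.items():
        if name in metrics and not meets(metrics[name], threshold):
            misses[name] = fix
    return misses
```

For example, a 12-minute pipeline with a 90% cache hit rate would be reported only for total CI time, with the fix "Parallelize, add caching".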
Workflow 2: CD Pipeline and Multi-Environment Deployment
python scripts/deployment_planner.py --type webapp --environments dev,staging,prod --format json
Environment promotion flow:
Build -> Dev (auto) -> Staging (auto) -> Production (manual approval)
                                              |
                                         Canary (10%) -> Full rollout
| Aspect | Dev | Staging | Production |
|---|---|---|---|
| Trigger | Every push | Merge to main | Manual approval |
| Replicas | 1 | 2 | 3+ (auto-scaled) |
| Secrets | Repository | Environment | Vault/OIDC |
| Monitoring | Basic logs | Full observability | Full + alerting |
Key CD rules:
- Build once, deploy the same artifact everywhere
- Tag artifacts with commit SHA for traceability
- Use environment protection rules for production gates
- Maintain rollback capability at every stage
Workflow 3: Pipeline Optimization
python scripts/pipeline_analyzer.py .github/workflows/ --format json -o report.json
The agent checks for:
- Missing caching -- dependencies reinstalled every run
- No timeouts -- stuck jobs burn budget
- Sequential chains that could parallelize
- Deprecated actions with newer versions available
- Security issues -- secrets in logs, missing permissions scoping
- Cost inefficiency -- oversized runners, no path filtering
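Two of the checks listed above (missing timeouts and missing permissions scoping) are simple enough to sketch. This is an illustration only and does not claim to reflect how pipeline_analyzer.py is implemented; it assumes the workflow file has already been parsed into a dict by a YAML loader, and the function name find_basic_issues is hypothetical.

```python
# Illustrative sketch, not the logic of pipeline_analyzer.py.
# Input: a workflow already parsed into a dict (for example by a YAML loader).

def find_basic_issues(workflow):
    """Return (job_name, finding) tuples for missing timeouts and permissions."""
    findings = []
    # A top-level permissions block applies to every job in the workflow.
    has_top_level_permissions = "permissions" in workflow
    for job_name, job in workflow.get("jobs", {}).items():
        if "timeout-minutes" not in job:
            findings.append((job_name, "no timeout-minutes"))
        if not has_top_level_permissions and "permissions" not in job:
            findings.append((job_name, "no permissions scoping"))
    return findings
```

A job that declares neither key in a workflow without top-level permissions yields both findings; a job with both keys yields none.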
Optimization techniques:
Path-based filtering -- skip CI for docs-only changes:
on:
  push:
    # paths and paths-ignore cannot be combined on the same event.
    # Listing only code paths already skips docs-only changes.
    paths: ['src/**', 'tests/**', 'requirements*.txt']
Concurrency cancellation -- cancel superseded runs:
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
Dependency caching:
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-deps-${{ hashFiles('**/requirements.txt') }}
Deployment Strategies
Decision tree:
Zero-downtime required?
  No  -> Rolling deployment
  Yes -> Need instant rollback?
    No  -> Rolling with health checks
    Yes -> Budget for 2x infrastructure?
      Yes -> Blue-green
      No  -> Canary
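The decision tree above maps directly onto a small function. A minimal sketch follows; the function name choose_strategy and its parameter names are illustrative assumptions, not part of deployment_planner.py.

```python
# Hypothetical encoding of the decision tree; names are illustrative.

def choose_strategy(zero_downtime, instant_rollback=False, double_infra_budget=False):
    """Return the deployment strategy the decision tree selects."""
    if not zero_downtime:
        return "rolling"
    if not instant_rollback:
        return "rolling with health checks"
    # Instant rollback is needed: blue-green if a second full environment
    # is affordable, otherwise canary.
    return "blue-green" if double_infra_budget else "canary"
```

Each question in the tree is only asked when the previous answer was "Yes", which is why the later parameters default to False.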
Canary traffic split schedule:
| Phase | % | Duration | Gate |
|---|---|---|---|
| 1 | 5% | 15 min | Error rate < 0.1% |
| 2 | 25% | 30 min | P99 latency < 200ms |
| 3 | 50% | 60 min | Business metrics stable |
| 4 | 100% | -- | Full promotion |
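The schedule above can be read as a loop over phases in which each gate must pass before traffic increases. The sketch below assumes a caller-supplied read_metrics function and hypothetical metric key names; shifting traffic and waiting out each phase duration are left as a comment, and returning traffic to 0% on a failed gate is an assumption about what rollback means here.

```python
# Illustrative canary gate loop. read_metrics, the metric keys, and the
# rollback-to-0% behaviour are assumptions made for this sketch.

CANARY_PHASES = [
    # (traffic %, soak minutes, gate predicate over the metrics dict)
    (5, 15, lambda m: m["error_rate_pct"] < 0.1),
    (25, 30, lambda m: m["p99_latency_ms"] < 200),
    (50, 60, lambda m: m["business_metrics_stable"]),
]


def run_canary(read_metrics):
    """Walk the phases in order; return (final traffic %, promoted)."""
    for percent, _soak_minutes, gate in CANARY_PHASES:
        # A real rollout would shift traffic to `percent` and wait
        # `_soak_minutes` here before sampling metrics.
        metrics = read_metrics(percent)
        if not gate(metrics):
            return 0, False  # gate failed: roll back
    return 100, True  # every gate passed: full promotion
```

Passing the metrics reader in as a parameter keeps the gate logic separate from whichever monitoring backend supplies the numbers.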
GitHub Actions Patterns
Reusable workflows -- define once, call everywhere:
# .github/workflows/reusable-deploy.yml
on:
  workflow_call:
    inputs:
      environment: { required: true, type: string }
      image_tag: { required: true, type: string }
    secrets:
      DEPLOY_KEY: { required: true }
OIDC authentication -- no long-lived credentials:
permissions:
  id-token: write
  contents: read
steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/github-actions
      aws-region: us-east-1
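Secrets hierarchy: Organization > Repository > Environment. When the same name is defined at more than one level, the narrowest scope wins (environment over repository over organization). Never echo secrets; mask dynamic values with the ::add-mask:: workflow command. Prefer OIDC for cloud auth.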
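Masking a dynamic value means writing the ::add-mask:: workflow command to standard output before the value can appear in any log line. A minimal sketch for a Python step follows; the helper names mask_command and mask_value are hypothetical.

```python
# Hypothetical helpers for a Python step running inside GitHub Actions.

def mask_command(value):
    """Build the workflow command that tells the runner to redact `value`."""
    return f"::add-mask::{value}"


def mask_value(value):
    """Print the command; the runner then redacts `value` in later log output."""
    print(mask_command(value))
```

Call mask_value as soon as the value is obtained, for example right after fetching a short-lived token, so that no earlier log line can expose it.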
Runner Cost Optimization
| Runner | vCPU | RAM | Cost/min | Best For |
|---|---|---|---|---|
| 2-core | 2 | 7 GB | $0.008 | Standard tasks |
| 4-core | 4 | 16 GB | $0.016 | Build-heavy |
| 8-core | 8 | 32 GB | $0.032 | Large compilations |
| 16-core | 16 | 64 GB | $0.064 | Parallel test suites |
Monthly estimate: (runs/day) x (avg min/run) x 30 x (cost/min)
Example: 50 pushes/day x 8 min x 30 = 12,000 min x $0.008 = $96/month.
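The monthly estimate can be expressed as a small function. This sketch copies the per-minute rates from the table above; the function name monthly_cost and its parameters are illustrative.

```python
# Per-minute rates copied from the runner table above.
RUNNER_COST_PER_MIN = {
    "2-core": 0.008,
    "4-core": 0.016,
    "8-core": 0.032,
    "16-core": 0.064,
}


def monthly_cost(runs_per_day, avg_minutes_per_run, runner="2-core", days=30):
    """Estimate monthly spend in dollars: runs/day x min/run x days x cost/min."""
    minutes = runs_per_day * avg_minutes_per_run * days
    return round(minutes * RUNNER_COST_PER_MIN[runner], 2)
```

With the example figures, monthly_cost(50, 8) gives 96.0, and the same workload on an 8-core runner gives 384.0, which only pays off if the larger runner cuts run time by more than a factor of four.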
Anti-Patterns
| Anti-Pattern | Problem | Fix |
|---|---|---|
| Monolithic workflow | 45-min single workflow | Split into parallel jobs |
| No caching | Reinstall deps every run | Cache dependencies and builds |
| Secrets in logs | Leaked credentials | add-mask, avoid echo |
| No timeout | Stuck jobs burn budget | timeout-minutes on every job |
| Full matrix every push | 30-min matrix on every commit | Full nightly; reduced on push |
| No rollback plan | Stuck with broken deploy | Automate rollback in CD pipeline |
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Workflow never triggers | Wrong on: config or branch name mismatch | Verify triggers match branching strategy |
| Cache miss every run | Volatile cache key (timestamp) | Use hashFiles() on lock files |
| Matrix fails on one OS only | Platform-specific paths or deps | Use shell: bash; install OS deps per matrix entry |
| Secret not available | Wrong environment scope | Ensure job declares correct environment: |
| Health check fails after deploy | App not started before check | Add retry loop with backoff |
| Concurrency cancels needed runs | Overly broad group key | Scope to workflow-ref; separate groups for deploy |
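The "retry loop with backoff" fix for failing post-deploy health checks can be sketched as follows. The check callable is supplied by the caller (for example, a function that requests the health endpoint and returns True on success); the function name and default values are assumptions.

```python
# Illustrative retry loop with exponential backoff; defaults are assumptions.
import time


def wait_until_healthy(check, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call check() until it returns True, doubling the delay between attempts."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:  # no need to wait after the final attempt
            sleep(delay)
            delay *= 2
    return False
```

The sleep parameter exists so the loop can be exercised in tests without real waiting; in a pipeline the default time.sleep is used.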
References
| Guide | Path |
|---|---|
| GitHub Actions Patterns | references/github-actions-patterns.md |
| Deployment Strategies | references/deployment-strategies.md |
| Agentic Workflows Guide | references/agentic-workflows-guide.md |
Integration Points
| Skill | Integration |
|---|---|
| release-orchestrator | Release workflows align with versioning and changelog |
| senior-devops | Deployment strategies complement infra automation |
| senior-secops | Security scanning steps feed SecOps dashboards |
| senior-qa | CI quality gates map to QA acceptance criteria |
| incident-commander | Rollback procedures connect to incident playbooks |
Last Updated: April 2026 | Version: 1.1.0