name: e2e-deploy-rhdh description: Deploy RHDH to an OpenShift cluster using local-run.sh for E2E test execution, with autonomous error recovery for deployment failures
Deploy RHDH
Deploy Red Hat Developer Hub to a cluster for E2E test execution using the existing local-run.sh workflow.
When to Use
Use this skill when you need a running RHDH instance to reproduce and fix a test failure.
Prerequisites
Before running the deployment, verify these tools are installed:
# Required tools (local-run.sh checks these automatically)
podman --version # Container runtime
oc version # OpenShift CLI
kubectl version --client # Kubernetes CLI
vault --version # HashiCorp Vault (for secrets)
jq --version # JSON processor
curl --version # HTTP client
rsync --version # File sync
bc --version # Calculator (for resource checks)
Podman Machine Requirements
The podman machine must be running with adequate resources:
podman machine inspect | jq '.Resources'
# Requires: >= 8GB RAM, >= 4 CPUs
If resources are insufficient:
podman machine stop
podman machine set --memory 8192 --cpus 4
podman machine start
Deployment Using local-run.sh
The primary deployment method uses e2e-tests/local-run.sh, which handles everything:
Vault authentication, cluster service account setup, RHDH deployment, and test execution.
Execution Rules
CRITICAL — deployment is a long-running operation:
- Never run
local-run.shin the background. Operator installations can take 20-30 minutes. Use the Bash tool withtimeout: 600000(10 minutes) and if it times out, check the container log — do NOT assume failure. - Before starting a deployment, check for existing containers:
If a deployment container is already running, wait for it to finish instead of starting a new one. Monitor via the container log:podman ps --format "{{.Names}} {{.Status}}" | grep -i rhdh-e2e-runnertail -f e2e-tests/.local-test/container.log - Never launch concurrent deployments. Two deployments to the same cluster will race and both fail. If a deployment appears stuck, check the container log and cluster state before deciding it failed.
- How to detect actual failure vs slow progress: The operator install script outputs detailed debug logs. If the container log shows active progress (timestamps advancing), the deployment is still running. Only consider it failed if:
- The podman container has exited (
podman psshows no running container) - AND the container log shows an error message (e.g., "Failed install RHDH Operator")
- The podman container has exited (
CLI Mode (Preferred)
CRITICAL: CLI mode requires all three flags (-j, -r, -t). If -r is omitted, the script falls into interactive mode and will hang in automated contexts.
cd e2e-tests
./local-run.sh -j <full-prow-job-name> -r <image-repo> -t <image-tag> [-s]
Example — OCP job (deploy-only with -s):
cd e2e-tests
./local-run.sh -j periodic-ci-redhat-developer-rhdh-main-e2e-ocp-v4-20-helm-nightly -r rhdh-community/rhdh -t next -s
Example — K8s job (AKS/EKS/GKE) (full execution, no -s):
cd e2e-tests
./local-run.sh -j periodic-ci-redhat-developer-rhdh-main-e2e-eks-helm-nightly -r rhdh-community/rhdh -t next
Parameters:
-j / --job: The full Prow CI job name extracted from the Prow URL. Theopenshift-ci-tests.shhandler uses bash glob patterns (like*ocp*helm*nightly*) to match, so the full name works correctly. Example:periodic-ci-redhat-developer-rhdh-main-e2e-ocp-v4-20-helm-nightly-r / --repo: Image repository (required for CLI mode — without it the script enters interactive mode)-t / --tag: Image tag (e.g.,1.9,next)-s / --skip-tests: Deploy only, skip test execution. OCP jobs only — K8s jobs (AKS, EKS, GKE) do not support this flag and require the full execution pipeline
WARNING: Do NOT use shortened job names like nightly-ocp-helm for -j — these do not match the glob patterns in openshift-ci-tests.sh.
Image Selection
Refer to the e2e-fix-workflow rule for the release branch to image repo/tag mapping table.
Deploy-Only Mode (OCP Jobs Only)
For OCP jobs, deploy without running tests so you can run specific tests manually:
./local-run.sh -j <full-prow-job-name> -r <image-repo> -t <tag> -s
Note: K8s jobs (AKS, EKS, GKE) do not support deploy-only mode. They require the full execution pipeline — run without -s.
What local-run.sh Does
- Validates prerequisites: Checks all required tools and podman resources
- Verifies the image: Checks the image exists on quay.io via the Quay API
- Pulls the runner image:
quay.io/rhdh-community/rhdh-e2e-runner:main - Authenticates to Vault: OIDC-based login for secrets
- Sets up cluster access: Creates
rhdh-local-testerservice account with cluster-admin, generates 48h token - Copies the repo: Syncs the local repo to
.local-test/rhdh/(excludes node_modules) - Runs a Podman container: Executes
container-init.shinside the runner image, which:- Fetches all Vault secrets to
/tmp/secrets/ - Logs into the cluster
- Sets platform-specific environment variables
- Runs
.ci/pipelines/openshift-ci-tests.shfor deployment
- Fetches all Vault secrets to
Post-Deployment: Setting Up for Manual Testing
After local-run.sh completes (with -s for OCP jobs, or after full execution for K8s jobs), set up the environment for headed Playwright testing:
# Source the test setup (choose 'showcase' or 'rbac')
source e2e-tests/local-test-setup.sh showcase
# or
source e2e-tests/local-test-setup.sh rbac
This exports:
BASE_URL— The RHDH instance URLK8S_CLUSTER_URL— Cluster API server URLK8S_CLUSTER_TOKEN— Fresh service account token- All Vault secrets as environment variables
Verify RHDH is accessible:
curl -sSk "$BASE_URL" -o /dev/null -w "%{http_code}"
# Should return 200
Deployment Error Recovery
Common Deployment Failures
CrashLoopBackOff
Symptoms: Pod repeatedly crashes and restarts.
Investigation:
# Check pod status
oc get pods -n <namespace>
# Check pod logs
oc logs -n <namespace> <pod-name> --previous
# Check events
oc get events -n <namespace> --sort-by=.lastTimestamp
Common causes and fixes:
- Missing ConfigMap: The app-config ConfigMap wasn't created → check
.ci/pipelines/resources/config_map/for the correct template - Bad plugin configuration: A dynamic plugin is misconfigured → check
dynamic-plugins-configConfigMap against.ci/pipelines/resources/config_map/dynamic-plugins-config.yaml - Missing secrets: Required secrets not mounted → verify secrets exist in the namespace
- Node.js errors: Check for JavaScript errors in logs that indicate code issues
ImagePullBackOff
Investigation:
oc describe pod -n <namespace> <pod-name> | grep -A5 "Events"
Common causes:
- Image doesn't exist: Verify on quay.io:
curl -s 'https://quay.io/api/v1/repository/rhdh/rhdh-hub-rhel9/tag/?filter_tag_name=like:<tag>' - Pull secret missing: Check
namespace::setup_image_pull_secretin.ci/pipelines/lib/namespace.sh - Registry auth: Ensure the pull secret has correct credentials
Helm Install Failure
Investigation:
helm list -n <namespace>
helm status <release-name> -n <namespace>
Common causes:
- Values file error: Check merged values against
.ci/pipelines/value_files/values_showcase.yaml - Chart version mismatch: Verify chart version with
helm::get_chart_versionfrom.ci/pipelines/lib/helm.sh
Operator Deployment Failure
Investigation:
oc get backstage -n <namespace>
oc describe backstage <name> -n <namespace>
oc get csv -n <namespace> # Check operator subscription status
Common causes:
- Backstage CR misconfigured: Compare against
.ci/pipelines/resources/rhdh-operator/rhdh-start.yaml - Operator not installed: Check CatalogSource and Subscription
- CRD not ready: Wait for CRD with
k8s_wait::crdpattern from.ci/pipelines/lib/k8s-wait.sh
Cross-Repo Investigation
When deployment issues stem from the operator or chart, search the relevant repos using whichever tool is available. Try them in this order and use the first one that works:
- Sourcebot (if available): search
rhdh-operatorandrhdh-chartrepos for specific error patterns or configuration keys - Context7 (if available): query
redhat-developer/rhdh-operatororredhat-developer/rhdh-chartfor docs and code snippets - Fallback —
gh search code:gh search code '<pattern>' --repo redhat-developer/rhdh-operatororredhat-developer/rhdh-chart - Fallback — local clone: clone the repo into a temp directory and grep for the pattern
Key areas to look for:
- rhdh-operator: Backstage CR configuration, CatalogSource setup, operator installation scripts
- rhdh-chart: Helm values schema, chart templates, default configurations
Reference Files
- Main deployment scripts:
.ci/pipelines/openshift-ci-tests.sh,.ci/pipelines/utils.sh - Library scripts:
.ci/pipelines/lib/helm.sh,.ci/pipelines/lib/operators.sh,.ci/pipelines/lib/k8s-wait.sh,.ci/pipelines/lib/testing.sh - Helm values:
.ci/pipelines/value_files/ - ConfigMaps:
.ci/pipelines/resources/config_map/ - Operator CRs:
.ci/pipelines/resources/rhdh-operator/ - Environment variables:
.ci/pipelines/env_variables.sh