name: kubernetes-orchestration description: "Kubernetes container orchestration. Use when deploying to Kubernetes, writing manifests, configuring Helm charts, or troubleshooting cluster issues."
Kubernetes Orchestration
Comprehensive guide for deploying, managing, and scaling applications on Kubernetes.
When to Use
- Deploying applications to Kubernetes
- Writing and reviewing Kubernetes manifests
- Creating Helm charts
- Configuring ingress and networking
- Setting up autoscaling
- Troubleshooting pod and cluster issues
Core Concepts
Resource Hierarchy
Cluster
├── Namespaces
│ ├── Deployments → ReplicaSets → Pods → Containers
│ ├── StatefulSets → Pods
│ ├── DaemonSets → Pods
│ ├── Services (ClusterIP, NodePort, LoadBalancer)
│ ├── Ingress
│ ├── ConfigMaps
│ ├── Secrets
│ ├── PersistentVolumeClaims
│ └── ServiceAccounts
└── Cluster-wide
├── Nodes
├── PersistentVolumes
├── StorageClasses
├── ClusterRoles
└── CustomResourceDefinitions
Namespace Organization
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
name: myapp-production
labels:
name: myapp-production
environment: production
team: platform
Namespace Strategy:
namespaces/
├── production
├── staging
├── development
├── monitoring # Prometheus, Grafana
├── logging # ELK, Loki
├── ingress-nginx # Ingress controllers
└── cert-manager # TLS certificates
Deployments
Production-Ready Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
namespace: production
labels:
app: myapp
version: v1.0.0
spec:
replicas: 3
revisionHistoryLimit: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
version: v1.0.0
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
spec:
serviceAccountName: myapp
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
containers:
- name: myapp
image: registry.example.com/myapp:v1.0.0
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
env:
- name: NODE_ENV
value: "production"
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: myapp-secrets
key: database-url
envFrom:
- configMapRef:
name: myapp-config
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
startupProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 10
periodSeconds: 5
failureThreshold: 30
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
volumeMounts:
- name: tmp
mountPath: /tmp
- name: config
mountPath: /app/config
readOnly: true
volumes:
- name: tmp
emptyDir: {}
- name: config
configMap:
name: myapp-config
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: myapp
topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app: myapp
terminationGracePeriodSeconds: 30
Services
ClusterIP (Internal):
apiVersion: v1
kind: Service
metadata:
name: myapp
namespace: production
spec:
type: ClusterIP
selector:
app: myapp
ports:
- name: http
port: 80
targetPort: http
protocol: TCP
LoadBalancer (External):
apiVersion: v1
kind: Service
metadata:
name: myapp-lb
namespace: production
annotations:
# AWS NLB
service.beta.kubernetes.io/aws-load-balancer-type: nlb
service.beta.kubernetes.io/aws-load-balancer-internal: "false"
spec:
type: LoadBalancer
selector:
app: myapp
ports:
- name: http
port: 80
targetPort: http
- name: https
port: 443
targetPort: https
Headless Service (StatefulSets):
apiVersion: v1
kind: Service
metadata:
name: myapp-headless
spec:
type: ClusterIP
clusterIP: None
selector:
app: myapp
ports:
- port: 80
Ingress
Nginx Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: myapp-ingress
namespace: production
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
tls:
- hosts:
- myapp.example.com
- api.example.com
secretName: myapp-tls
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-frontend
port:
number: 80
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-api
port:
number: 80
ConfigMaps & Secrets
ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: myapp-config
namespace: production
data:
# Simple key-value
LOG_LEVEL: "info"
API_TIMEOUT: "30s"
# File content
config.yaml: |
server:
port: 8080
host: 0.0.0.0
features:
cache: true
debug: false
Secrets:
apiVersion: v1
kind: Secret
metadata:
name: myapp-secrets
namespace: production
type: Opaque
data:
# Base64 encoded
database-url: cG9zdGdyZXM6Ly91c2VyOnBhc3NAZGIvYXBw
api-key: c2VjcmV0LWFwaS1rZXk=
stringData:
# Plain text (will be encoded)
another-secret: "plain-text-value"
External Secrets (with External Secrets Operator):
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: myapp-secrets
spec:
refreshInterval: 1h
secretStoreRef:
kind: ClusterSecretStore
name: aws-secrets-manager
target:
name: myapp-secrets
creationPolicy: Owner
data:
- secretKey: database-url
remoteRef:
key: prod/myapp/database
property: url
StatefulSets
Database StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgresql
namespace: production
spec:
serviceName: postgresql-headless
replicas: 3
selector:
matchLabels:
app: postgresql
template:
metadata:
labels:
app: postgresql
spec:
containers:
- name: postgresql
image: postgres:16-alpine
ports:
- containerPort: 5432
env:
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgresql-secrets
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 1000m
memory: 1Gi
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 10Gi
Autoscaling
Horizontal Pod Autoscaler:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: myapp-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: 1000
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 15
- type: Pods
value: 4
periodSeconds: 15
selectPolicy: Max
Vertical Pod Autoscaler:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: myapp-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
updatePolicy:
updateMode: Auto
resourcePolicy:
containerPolicies:
- containerName: myapp
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: 2
memory: 2Gi
RBAC
ServiceAccount:
apiVersion: v1
kind: ServiceAccount
metadata:
name: myapp
namespace: production
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: myapp-role
namespace: production
rules:
- apiGroups: [""]
resources: ["configmaps", "secrets"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: myapp-rolebinding
namespace: production
subjects:
- kind: ServiceAccount
name: myapp
namespace: production
roleRef:
kind: Role
name: myapp-role
apiGroup: rbac.authorization.k8s.io
Network Policies
Restrict Ingress/Egress:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: myapp-network-policy
namespace: production
spec:
podSelector:
matchLabels:
app: myapp
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: ingress-nginx
- podSelector:
matchLabels:
app: myapp-frontend
ports:
- protocol: TCP
port: 8080
egress:
- to:
- podSelector:
matchLabels:
app: postgresql
ports:
- protocol: TCP
port: 5432
- to:
- namespaceSelector: {}
podSelector:
matchLabels:
k8s-app: kube-dns
ports:
- protocol: UDP
port: 53
Helm Charts
Chart Structure:
myapp-chart/
├── Chart.yaml
├── values.yaml
├── values-production.yaml
├── templates/
│ ├── _helpers.tpl
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── configmap.yaml
│ ├── secrets.yaml
│ ├── hpa.yaml
│ └── NOTES.txt
└── charts/ # Dependencies
Chart.yaml:
apiVersion: v2
name: myapp
description: My Application Helm Chart
type: application
version: 1.0.0
appVersion: "1.0.0"
dependencies:
- name: postgresql
version: 12.x.x
repository: https://charts.bitnami.com/bitnami
condition: postgresql.enabled
values.yaml:
replicaCount: 3
image:
repository: registry.example.com/myapp
tag: latest
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
ingress:
enabled: true
className: nginx
hosts:
- host: myapp.example.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: myapp-tls
hosts:
- myapp.example.com
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 20
targetCPUUtilizationPercentage: 70
postgresql:
enabled: true
auth:
database: myapp
Template Example:
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "myapp.fullname" . }}
labels:
{{- include "myapp.labels" . | nindent 4 }}
spec:
{{- if not .Values.autoscaling.enabled }}
replicas: {{ .Values.replicaCount }}
{{- end }}
selector:
matchLabels:
{{- include "myapp.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "myapp.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
ports:
- containerPort: 8080
resources:
{{- toYaml .Values.resources | nindent 12 }}
Commands Reference
# Cluster info
kubectl cluster-info
kubectl get nodes
kubectl top nodes
# Namespace operations
kubectl get namespaces
kubectl create namespace myapp
kubectl config set-context --current --namespace=myapp
# Workloads
kubectl get pods -o wide
kubectl get deployments
kubectl describe pod <pod-name>
kubectl logs <pod-name> -f
kubectl logs <pod-name> -c <container-name>
kubectl exec -it <pod-name> -- /bin/sh
# Apply manifests
kubectl apply -f manifest.yaml
kubectl apply -f ./manifests/
kubectl apply -k ./kustomize/ # Kustomize
# Scaling
kubectl scale deployment myapp --replicas=5
kubectl autoscale deployment myapp --min=3 --max=10 --cpu-percent=70
# Rolling updates
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp
kubectl rollout history deployment/myapp
kubectl rollout undo deployment/myapp
kubectl rollout undo deployment/myapp --to-revision=2
# Debug
kubectl describe pod <pod-name>
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl run debug --rm -it --image=busybox -- /bin/sh
kubectl port-forward pod/<pod-name> 8080:8080
kubectl cp <pod-name>:/path/to/file ./local-file
# Helm
helm install myapp ./myapp-chart
helm upgrade myapp ./myapp-chart -f values-prod.yaml
helm rollback myapp 1
helm list
helm uninstall myapp
Troubleshooting
| Issue | Debug Command |
|---|---|
| Pod stuck Pending | kubectl describe pod <name> - check events |
| Pod CrashLoopBackOff | kubectl logs <pod> --previous |
| Service not reachable | kubectl get endpoints <service> |
| Ingress not working | Check ingress controller logs |
| Node pressure | kubectl describe node <name> |
| OOMKilled | Increase memory limits |
Common Pod States:
Pending → Check node resources, PVC binding
ContainerCreating → Image pull, volume mount
Running → Healthy
Succeeded → Job completed
Failed → Check logs
Unknown → Node communication issue
Checklist
Before deploying:
- Resource requests/limits defined
- Liveness and readiness probes configured
- Security context (non-root, read-only fs)
- ConfigMaps/Secrets externalized
- Network policies in place
- HPA configured for scaling
- Pod disruption budgets set
- Affinity/anti-affinity rules
- Image tags pinned (not :latest)
Integration
Works with:
/devops- Deployment pipelines/docker- Container images for K8sgitopsskill - ArgoCD/Flux deployments/aws,/gcp,/azure- Managed K8s services/security- Cluster security review