Kubernetes Interview Questions & Answers (2025)

What is Kubernetes and why do we use it?

Basic

Kubernetes (K8s) is an open-source container orchestration platform originally built by Google. It automates the deployment, scaling, and management of containerized applications.

Why use Kubernetes?

Self-healing — restarts failed containers, replaces and reschedules pods
Auto-scaling — scales pods up/down based on CPU, memory, or custom metrics
Rolling updates — deploy new versions with zero downtime
Load balancing — distributes traffic across pod replicas automatically
Storage orchestration — mounts local or cloud storage automatically
Secret management — stores and manages sensitive configuration data

Core architecture:

Control Plane                    Worker Nodes
┌─────────────────┐              ┌────────────────────┐
│  kube-apiserver  │◄────────────│  kubelet           │
│  etcd            │             │  kube-proxy        │
│  kube-scheduler  │             │  container runtime │
│  controller-mgr  │             │  (containerd)      │
└─────────────────┘              └────────────────────┘

Component	Role
`kube-apiserver`	Entry point for all API requests
`etcd`	Distributed key-value store — cluster source of truth
`kube-scheduler`	Assigns pods to nodes based on resources
`kube-controller-manager`	Runs reconciliation loops (ReplicaSet, Node, etc.)
`kubelet`	Node agent — ensures containers run as specified
`kube-proxy`	Maintains network rules for Service IPs
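
The reconciliation loops mentioned for `kube-controller-manager` can be pictured with a small sketch. This is a simplified illustration in Python, not the real controller code; the `reconcile` function and the generated pod names are hypothetical.

```python
def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """Return the actions needed to move actual state toward desired state."""
    actions = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        actions += [f"create pod-{i}" for i in range(diff)]
    elif diff < 0:
        # Too many pods: delete the surplus.
        actions += [f"delete {name}" for name in running_pods[:-diff]]
    return actions


print(reconcile(3, ["pod-a"]))            # two pods are missing
print(reconcile(1, ["pod-a", "pod-b"]))   # one pod is surplus
print(reconcile(2, ["pod-a", "pod-b"]))   # nothing to do
```

A real controller runs this comparison continuously against the API server, which is why a deleted pod owned by a ReplicaSet reappears.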

What is a Pod? How is it different from a container?

Basic

A Pod is the smallest deployable unit in Kubernetes. It wraps one or more containers that share the same network namespace, IP address, and storage volumes.

Key differences:

	Container	Pod
Unit	Typically one process	One or more containers
Network	Own namespace	Shared namespace — containers use `localhost`
Storage	Own filesystem	Shared volumes between containers
Managed by	Docker/containerd	Kubernetes

Single-container Pod (most common):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "250m"
        memory: "256Mi"

Multi-container Pod (sidecar pattern):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
  - name: app           # Main container, writes logs to the shared volume
    image: my-app:v1
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-shipper   # Sidecar — reads logs from shared volume
    image: fluent/fluent-bit:latest
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}

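Pods are ephemeral. The kubelet restarts a crashed container in place according to the pod's restartPolicy, but a pod that is deleted, evicted, or lost with its node is NOT recreated unless a controller (Deployment, StatefulSet, DaemonSet) manages it.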

What is the difference between a Deployment, StatefulSet, DaemonSet, and Job?

Basic

Each workload type serves a different purpose:

Kind	Use For	Pod Names	Storage	Ordering
Deployment	Stateless apps (APIs, web)	Random	Ephemeral	Any order
StatefulSet	Stateful apps (DBs, queues)	Stable (`app-0`, `app-1`)	Persistent PVC per pod	Ordered (0→1→2)
DaemonSet	One pod per node (agents)	Random suffix	Node-local	N/A
Job	Run-to-completion tasks	Random	Optional	N/A
CronJob	Scheduled recurring tasks	Random	Optional	N/A

Deployment — stateless, rolling updates:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
      - name: api
        image: my-api:v2

StatefulSet — stable identity for databases:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
  volumeClaimTemplates:          # Each pod gets its own PVC
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
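
The stable identity a StatefulSet gives each replica follows a predictable naming scheme. The sketch below derives the pod name, PVC name, and per-pod DNS name for the `postgres` example; the helper `statefulset_identities` is hypothetical, and the `cluster.local` domain and `default` namespace are assumptions.

```python
def statefulset_identities(name, service, replicas, claim_template, namespace="default"):
    """List (pod name, PVC name, DNS name) for each ordinal of a StatefulSet."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"                       # stable, ordinal-based pod name
        pvc = f"{claim_template}-{pod}"                 # one PVC per pod from volumeClaimTemplates
        dns = f"{pod}.{service}.{namespace}.svc.cluster.local"
        identities.append((pod, pvc, dns))
    return identities


for pod, pvc, dns in statefulset_identities("postgres", "postgres-headless", 3, "data"):
    print(pod, pvc, dns)
```

Because the names depend only on the ordinal, a rescheduled `postgres-1` reattaches to `data-postgres-1` rather than to a fresh volume.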

DaemonSet — one pod per node (log agents, monitoring):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16

What are the types of Kubernetes Services and when do you use each?

Basic

A Service gives pods a stable network endpoint. Pods come and go, but the Service IP stays constant.

Type	Accessible From	Port Range	Use Case
`ClusterIP`	Inside cluster only	Any	Internal service-to-service
`NodePort`	Outside via `NodeIP:Port`	30000–32767	Dev/test access
`LoadBalancer`	External via cloud LB	Any	Production external traffic
`ExternalName`	Inside cluster	N/A	Alias to external DNS name

ClusterIP (default):

apiVersion: v1
kind: Service
metadata:
  name: backend-svc
spec:
  type: ClusterIP       # default — can omit this line
  selector:
    app: backend
  ports:
  - port: 80            # Service port
    targetPort: 8080    # Container port

LoadBalancer (production — AWS NLB example):

apiVersion: v1
kind: Service
metadata:
  name: web-svc
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 443
    targetPort: 8080
    protocol: TCP

Headless Service (StatefulSets — direct pod DNS):

apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None       # Makes it headless
  selector:
    app: postgres
  ports:
  - port: 5432
# DNS: postgres-0.postgres-headless.default.svc.cluster.local

What is a ConfigMap and Secret? How do you use them in a Pod?

Basic

ConfigMap stores non-sensitive configuration. Secret stores sensitive data (base64-encoded, can be encrypted at rest).

Create ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  LOG_LEVEL: "info"
  config.yaml: |
    server:
      port: 8080
      timeout: 30s

Create Secret:

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
stringData:              # stringData auto-encodes to base64
  DB_HOST: "postgres.prod.svc.cluster.local"
  DB_USER: "appuser"
  DB_PASS: "supersecretpassword"
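
To make the point that base64 is an encoding and not encryption, here is a small Python sketch of what happens to `stringData` values. The function name `encode_string_data` is hypothetical; it only mimics the transformation and does not talk to a cluster.

```python
import base64


def encode_string_data(string_data: dict[str, str]) -> dict[str, str]:
    """Mimic how stringData values end up base64-encoded under data."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in string_data.items()
    }


data = encode_string_data({"DB_USER": "appuser"})
print(data["DB_USER"])                                     # YXBwdXNlcg==
print(base64.b64decode(data["DB_USER"]).decode("utf-8"))   # appuser
```

Anyone who can read the Secret object can reverse the encoding in one call, so access control and encryption at rest still matter.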

Use in Pod — as environment variables:

spec:
  containers:
  - name: app
    image: my-app:v1
    env:
    - name: APP_ENV
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: APP_ENV
    - name: DB_PASS
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: DB_PASS
    envFrom:             # Load ALL keys from ConfigMap as env vars
    - configMapRef:
        name: app-config

Use in Pod — as mounted files:

spec:
  containers:
  - name: app
    image: my-app:v1
    volumeMounts:
    - name: config-vol
      mountPath: /etc/app/config.yaml
      subPath: config.yaml
  volumes:
  - name: config-vol
    configMap:
      name: app-config

# Quick commands
kubectl create configmap app-config --from-file=config.yaml
kubectl create secret generic db-secret --from-literal=DB_PASS=mysecret
kubectl get secret db-secret -o jsonpath='{.data.DB_PASS}' | base64 -d

What is a Namespace? Why do we use them?

Basic

A Namespace is a virtual cluster within a physical cluster. It provides a mechanism for isolating groups of resources.

Default namespaces:

kubectl get namespaces
# NAME              STATUS   AGE
# default           Active   30d   ← where resources go if not specified
# kube-system       Active   30d   ← Kubernetes system components
# kube-public       Active   30d   ← publicly readable resources
# kube-node-lease   Active   30d   ← node heartbeat objects

Create and use a namespace:

# Create namespace
kubectl create namespace production
kubectl create namespace staging

# Deploy to a specific namespace
kubectl apply -f deployment.yaml -n production

# Set default namespace for your context
kubectl config set-context --current --namespace=production

# View common resources in a namespace ("get all" omits ConfigMaps, Secrets, Ingresses, PVCs)
kubectl get all -n production

# View resources across ALL namespaces
kubectl get pods --all-namespaces
kubectl get pods -A   # shorthand

Namespace with resource quotas:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
    services: "10"

Key facts:

Some resources are cluster-scoped (Nodes, PersistentVolumes, ClusterRoles) — they do NOT belong to a namespace
DNS format: service-name.namespace.svc.cluster.local
ResourceQuotas and LimitRanges are set per namespace
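
The DNS format above can be expressed as a tiny helper. This is an illustrative sketch; `service_fqdn` and `short_name` are hypothetical names, and `cluster.local` is the common default cluster domain, which a cluster may override.

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def short_name(service: str, service_namespace: str, caller_namespace: str) -> str:
    """Shortest name a pod can use: bare name in the same namespace, else name.namespace."""
    if service_namespace == caller_namespace:
        return service
    return f"{service}.{service_namespace}"


print(service_fqdn("my-svc", "production"))           # my-svc.production.svc.cluster.local
print(short_name("my-svc", "production", "staging"))  # my-svc.production
```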

What are resource requests and limits? What happens when a container exceeds them?

Basic

Requests = minimum guaranteed resources (used for scheduling). Limits = maximum allowed resources (enforced at runtime).

spec:
  containers:
  - name: app
    image: my-app:v1
    resources:
      requests:
        cpu: "250m"       # 0.25 CPU cores — guaranteed
        memory: "256Mi"   # 256 MiB — guaranteed
      limits:
        cpu: "500m"       # 0.5 CPU cores — maximum
        memory: "512Mi"   # 512 MiB — maximum

What happens when limits are exceeded:

Resource	Exceeded Behavior
CPU	Container is throttled — slowed down but NOT killed
Memory	Container is OOMKilled (exit code 137) — killed immediately

# Check if a pod was OOMKilled
kubectl describe pod <pod-name>
# Look for: OOMKilled in the Last State section

# Check exit code
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
# 137 = killed by SIGKILL (128 + 9), typically OOMKilled; 1 = app crash; 0 = success

CPU units:

1 CPU    = 1 vCPU = 1 AWS vCPU = 1 GCP Core = 1000m
500m     = 0.5 CPU
100m     = 0.1 CPU (1m is the finest precision Kubernetes accepts)
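
The unit arithmetic can be checked with a short Python sketch. It handles only the suffixes used in this article (`m` for CPU; `Ki`, `Mi`, `Gi` for memory), and the function names are hypothetical, so treat it as an illustration and not a full quantity parser.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


print(cpu_to_millicores("250m"), cpu_to_millicores("0.5"), cpu_to_millicores("1"))
print(memory_to_bytes("256Mi"))
```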

Quality of Service (QoS) classes:

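Guaranteed  → every container sets CPU and memory requests equal to its limits (evicted last)
Burstable   → at least one request or limit is set, but the Guaranteed criteria are not met (middle)
BestEffort  → no requests or limits set on any container (evicted first)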

# Check QoS class of a pod
kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'
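
The QoS rules can be written down as a short classification function. This is a sketch under stated assumptions: `qos_class` is a hypothetical helper, quantities are compared as literal strings (so "500m" and "0.5" would count as different), and only CPU and memory are considered.

```python
def qos_class(containers: list[dict]) -> str:
    """containers: list of {"requests": {...}, "limits": {...}} dicts."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"requests": {"cpu": "250m", "memory": "256Mi"},
                  "limits": {"cpu": "500m", "memory": "512Mi"}}]))  # Burstable
```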

What is a PersistentVolume (PV) and PersistentVolumeClaim (PVC)?

Basic

PersistentVolume (PV) — a piece of storage provisioned in the cluster (like an AWS EBS volume). PersistentVolumeClaim (PVC) — a request for storage by a user/pod (like requesting a specific size).

Developer creates PVC  →  K8s finds/creates matching PV  →  Pod mounts PVC

Dynamic provisioning (most common — cloud environments):

# 1. StorageClass defines HOW storage is created
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Delete
allowVolumeExpansion: true

---
# 2. PVC requests storage (PV created automatically)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  accessModes:
  - ReadWriteOnce      # RWO = one node at a time
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi

---
# 3. Pod mounts the PVC
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
  - name: postgres
    image: postgres:15
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: database-pvc

Access modes:

Mode	Short	Description
`ReadWriteOnce`	RWO	Read/write by one node
`ReadOnlyMany`	ROX	Read-only by many nodes
`ReadWriteMany`	RWX	Read/write by many nodes (NFS, EFS)

kubectl get pv               # List all PersistentVolumes
kubectl get pvc              # List all PersistentVolumeClaims
kubectl describe pvc my-pvc  # Check binding status

What is an Ingress and how is it different from a Service?

Intermediate

A Service exposes pods inside the cluster or with a single external IP/port. An Ingress is an API object that manages HTTP/HTTPS routing to multiple services from a single entry point — like a smart router.

Internet → Ingress Controller (NGINX/ALB) → Ingress Rules → Services → Pods

Without Ingress — need a LoadBalancer per service (expensive):

app.com    → LoadBalancer 1 ($$$) → Service A
api.com    → LoadBalancer 2 ($$$) → Service B
admin.com  → LoadBalancer 3 ($$$) → Service C

With Ingress — one LoadBalancer, smart routing:

                    ┌── /        → Service A (frontend)
LoadBalancer ───────┼── /api/    → Service B (backend)
                    └── /admin/  → Service C (admin)

Ingress resource:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - yourdomain.com
    secretName: tls-cert-secret
  rules:
  - host: yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-svc
            port:
              number: 80
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: backend-svc
            port:
              number: 80       # Service port of backend-svc (its targetPort is 8080)
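
How `pathType: Prefix` picks a backend can be sketched in a few lines of Python. The helpers `prefix_matches` and `route` are hypothetical; the sketch assumes matching is done per path element and that the longest matching path wins, and it ignores hosts and the other path types.

```python
def prefix_matches(rule_path: str, request_path: str) -> bool:
    """Prefix matching works on whole path elements, not on raw characters."""
    rule = [part for part in rule_path.split("/") if part]
    request = [part for part in request_path.split("/") if part]
    return request[: len(rule)] == rule


def route(rules: dict[str, str], request_path: str):
    """Return the backend of the longest matching rule, or None."""
    matches = [path for path in rules if prefix_matches(path, request_path)]
    if not matches:
        return None
    return rules[max(matches, key=len)]


rules = {"/": "frontend-svc", "/api": "backend-svc"}
print(route(rules, "/api/users"))  # backend-svc
print(route(rules, "/apiv2"))      # frontend-svc ("/api" does not match "/apiv2")
```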

Install NGINX Ingress Controller (most common):

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace

Q10

How does Kubernetes networking work? Explain the CNI and how pods communicate.

Intermediate

Kubernetes networking follows 3 fundamental rules:

Every pod gets its own IP address
Pods can communicate with all other pods without NAT
Nodes can communicate with all pods without NAT

CNI (Container Network Interface) plugins implement these rules:

CNI Plugin	Use Case	Features
Calico	Most popular	NetworkPolicy, BGP, eBPF
Flannel	Simple, lightweight	Basic overlay network
Cilium	High performance	eBPF, L7 policies, observability
Weave	Easy setup	Optional traffic encryption
AWS VPC CNI	EKS native	Pods get real VPC IPs

Pod-to-Pod communication:

Same node:    Pod A → veth → cbr0 bridge → veth → Pod B
Across nodes: Pod A → veth → cbr0 → eth0 → [overlay/BGP] → eth0 → cbr0 → Pod B

DNS resolution in the cluster:

# Pod DNS format
<pod-ip-dashes>.<namespace>.pod.cluster.local
# Example: 10-0-0-1.default.pod.cluster.local

# Service DNS format
<service-name>.<namespace>.svc.cluster.local
# Example: my-svc.production.svc.cluster.local

# Test DNS from inside a pod
kubectl run dns-test --image=busybox --rm -it --restart=Never -- nslookup my-svc.production

NetworkPolicy — restrict traffic between pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-netpol
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend         # Apply to backend pods
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend    # Only allow traffic from frontend pods
    ports:
    - protocol: TCP
      port: 8080
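  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres    # Backend may only open connections to postgres
    ports:
    - protocol: TCP
      port: 5432
  - ports:                 # Also allow DNS, otherwise Service names cannot be resolved
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53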

Q11

What is RBAC in Kubernetes? Create a read-only role for a developer.

Intermediate

RBAC (Role-Based Access Control) controls who can do what in Kubernetes. It uses 4 objects:

Object	Scope	Purpose
`Role`	Namespace	Defines permissions within a namespace
`ClusterRole`	Cluster-wide	Defines cluster-wide permissions
`RoleBinding`	Namespace	Binds Role/ClusterRole to users/groups/SAs in a namespace
`ClusterRoleBinding`	Cluster-wide	Binds ClusterRole cluster-wide

Complete read-only setup for a developer:

# 1. Role — read-only in 'development' namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer-readonly
  namespace: development
rules:
- apiGroups: [""]                    # Core API group
  resources: ["pods", "pods/log", "services", "configmaps", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources: ["jobs", "cronjobs"]
  verbs: ["get", "list", "watch"]

---
# 2. RoleBinding — attach role to a user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-readonly-binding
  namespace: development
subjects:
- kind: User
  name: john@company.com   # IAM user or OIDC user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer-readonly
  apiGroup: rbac.authorization.k8s.io

Test permissions:

# Check what a user can do
kubectl auth can-i list pods -n development --as=john@company.com
# yes

kubectl auth can-i delete pods -n development --as=john@company.com
# no

# List all permissions for a user
kubectl auth can-i --list -n development --as=john@company.com
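
The way the `developer-readonly` Role answers a `can-i` question can be approximated in Python. This is a deliberately reduced sketch: `can_i` is a hypothetical helper, and it ignores wildcards, `resourceNames`, and non-resource URLs that real RBAC supports.

```python
READ = ["get", "list", "watch"]
RULES = [
    {"apiGroups": [""], "resources": ["pods", "pods/log", "services", "configmaps", "endpoints"], "verbs": READ},
    {"apiGroups": ["apps"], "resources": ["deployments", "replicasets", "statefulsets", "daemonsets"], "verbs": READ},
    {"apiGroups": ["batch"], "resources": ["jobs", "cronjobs"], "verbs": READ},
]


def can_i(rules: list[dict], verb: str, resource: str, api_group: str = "") -> bool:
    """RBAC is additive: access is granted if any single rule allows the request."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )


print(can_i(RULES, "list", "pods"))     # True
print(can_i(RULES, "delete", "pods"))   # False
```

There are no deny rules in RBAC, so anything not matched by some rule is refused by default.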

Q12

How does HorizontalPodAutoscaler (HPA) work? Write a complete HPA config.

Intermediate

HPA automatically adjusts the number of Pod replicas based on observed metrics (CPU, memory, or custom).

How it works:

Metrics Server → HPA Controller (polls every 15s) → adjusts replica count

Requirements:

Metrics Server must be installed in the cluster
Pods must define resource requests (otherwise HPA cannot calculate utilization)
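
The replica calculation the HPA controller performs is essentially a ratio. The sketch below follows the documented formula `ceil(currentReplicas * currentMetric / targetMetric)`; the function name is hypothetical, and the 10% tolerance and the 2 to 20 replica bounds are assumptions taken from the default behaviour and this article's example.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 2, max_replicas: int = 20,
                     tolerance: float = 0.1) -> int:
    """Compute the replica count the HPA would aim for."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling
    raw = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90, 70))   # 6: CPU at 90% against a 70% target
print(desired_replicas(4, 72, 70))   # 4: within tolerance
print(desired_replicas(10, 35, 70))  # 5: load halved
```

When several metrics are configured, the controller computes a count for each and uses the largest.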

Complete HPA with CPU + Memory + Custom metric:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  metrics:
  # Scale on CPU utilization
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70    # Target 70% CPU across all pods
  # Scale on Memory
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 512Mi       # Target 512Mi per pod
  # Scale on custom metric (requests per second); requires a custom metrics adapter such as Prometheus Adapter
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"
  # Scaling behaviour (prevent flapping)
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60      # Act on the lowest recommendation from the last 60s
      policies:
      - type: Pods
        value: 4                          # Max 4 pods per scale-up
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300     # Act on the highest recommendation from the last 5 min
      policies:
      - type: Percent
        value: 10                         # Scale down max 10% at a time
        periodSeconds: 60

# Install Metrics Server (required for HPA)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Watch HPA in action
kubectl get hpa -n production -w

# Check current metrics
kubectl top pods -n production

Q13

What is Helm and why is it used in Kubernetes?

Intermediate

Helm is the package manager for Kubernetes. It lets you define, install, and upgrade complex Kubernetes applications using charts (packaged YAML templates).

Without Helm — manually apply 10+ YAML files:

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml
kubectl apply -f configmap.yaml
kubectl apply -f secret.yaml
# ...and more

With Helm — one command:

helm install my-app ./my-chart --namespace production

Chart structure:

my-chart/
├── Chart.yaml         ← Chart metadata (name, version, description)
├── values.yaml        ← Default configuration values
├── templates/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── configmap.yaml
│   └── _helpers.tpl   ← Reusable template functions
└── charts/            ← Dependency charts

Key Helm commands:

# Add a chart repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

# Search for charts
helm search repo nginx

# Install a chart
helm install my-nginx bitnami/nginx \
  --namespace web \
  --create-namespace \
  --set replicaCount=3 \
  --set service.type=LoadBalancer

# Install with custom values file
helm install my-nginx bitnami/nginx -f custom-values.yaml

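# Upgrade a release (same namespace as the install; --reuse-values keeps earlier --set values)
helm upgrade my-nginx bitnami/nginx -n web --reuse-values --set image.tag=1.25

# Rollback to revision 1 (omit the revision to roll back to the previous release)
helm rollback my-nginx 1 -n web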

# List all releases
helm list --all-namespaces

# Uninstall a release
helm uninstall my-nginx -n web

# Render templates without installing (dry-run)
helm template my-nginx bitnami/nginx
helm install my-nginx bitnami/nginx --dry-run --debug

Q14

What are Taints and Tolerations? Give a real-world example.

Intermediate

Taints are applied to nodes to repel pods from being scheduled there. Tolerations are applied to pods to allow them to schedule on tainted nodes.

Taint on node  →  repels every pod that lacks a matching toleration
Toleration on pod  →  "I can tolerate this taint, schedule me here"

Taint effects:

Effect	Behavior
`NoSchedule`	New pods without toleration won’t schedule here
`PreferNoSchedule`	Scheduler tries to avoid, but not guaranteed
`NoExecute`	New pods won’t schedule AND existing pods without a toleration are evicted
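
The matching rule between a toleration and a taint can be sketched as follows. The helpers `tolerates` and `can_schedule` are hypothetical and cover only the `Equal` and `Exists` operators; `tolerationSeconds` and the rest of the scheduler's filtering are left out.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists ignores the value; an empty key tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(tolerations: list[dict], taints: list[dict]) -> bool:
    """PreferNoSchedule is only a preference, so just the hard effects block scheduling."""
    hard = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}
print(can_schedule([gpu_toleration], [gpu_taint]))  # True
print(can_schedule([], [gpu_taint]))                # False
```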

Real-world example — dedicated GPU nodes:

# 1. Taint the GPU node
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule

# 2. View taints on a node
kubectl describe node gpu-node-1 | grep -i taint

# 3. Remove a taint (add minus at the end)
kubectl taint nodes gpu-node-1 dedicated=gpu:NoSchedule-

# 4. Pod with toleration can schedule on GPU node
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
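  nodeSelector:
    dedicated: gpu           # Attracts the pod to GPU nodes; needs the node LABEL dedicated=gpu
                             # (kubectl label nodes gpu-node-1 dedicated=gpu), since a taint is not a label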
  containers:
  - name: training
    image: tensorflow/tensorflow:latest-gpu
    resources:
      limits:
        nvidia.com/gpu: 1

Common real-world taints:

# Taint spot instances (avoid running critical workloads)
kubectl taint nodes spot-node-1 spot=true:NoSchedule

# Taint nodes for maintenance
kubectl taint nodes node-1 maintenance=true:NoExecute

# Dedicated nodes for monitoring stack
kubectl taint nodes monitoring-node dedicated=monitoring:NoSchedule

Q15

What is the difference between liveness, readiness, and startup probes?

Intermediate

Kubernetes uses three types of probes to monitor container health:

Probe	Question	Failure Action	When Used
Liveness	Is the app alive?	Restart container	App is stuck/deadlocked
Readiness	Is the app ready for traffic?	Remove from Service endpoints	App still initializing
Startup	Has the app started?	Restart container (overrides liveness during startup)	Slow-starting apps

Complete probe configuration:

spec:
  containers:
  - name: api
    image: my-api:v1
    ports:
    - containerPort: 8080

    # Startup probe — give app 5 mins to start (30 * 10s)
    # Liveness/Readiness are paused until startup succeeds
    startupProbe:
      httpGet:
        path: /health/startup
        port: 8080
      failureThreshold: 30
      periodSeconds: 10

    # Liveness probe — restart if app is dead
    livenessProbe:
      httpGet:
        path: /health/live
        port: 8080
      initialDelaySeconds: 10   # Wait 10s after container starts
      periodSeconds: 15         # Check every 15s
      failureThreshold: 3       # Restart after 3 failures
      timeoutSeconds: 5         # Fail if no response in 5s

    # Readiness probe — remove from load balancer if not ready
    readinessProbe:
      httpGet:
        path: /health/ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 3
      successThreshold: 1       # Need 1 success to be marked ready
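
The timing comments in the probe configuration follow from simple arithmetic. The sketch below reproduces it; `failure_window_seconds` is a hypothetical helper, and the result is an approximation that ignores probe timeouts and scheduling jitter.

```python
def failure_window_seconds(failure_threshold: int, period_seconds: int,
                           initial_delay_seconds: int = 0) -> int:
    """Approximate time before a probe gives up: delay + threshold * period."""
    return initial_delay_seconds + failure_threshold * period_seconds


startup = failure_window_seconds(failure_threshold=30, period_seconds=10)
liveness = failure_window_seconds(failure_threshold=3, period_seconds=15)
print(f"startup budget: {startup}s")         # 300s, i.e. 5 minutes
print(f"liveness detection: ~{liveness}s")   # about 45s of consecutive failures
```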

Probe types:

# HTTP GET — most common
httpGet:
  path: /healthz
  port: 8080
  httpHeaders:
  - name: X-Health-Check
    value: "true"

# TCP Socket — for non-HTTP services
tcpSocket:
  port: 5432     # Just checks if port is open

# Exec command — run a command inside container
exec:
  command:
  - /bin/sh
  - -c
  - "redis-cli ping | grep PONG"

Q16

What is a ServiceAccount and when do you use it?

Intermediate

A ServiceAccount provides an identity for processes running inside a Pod to interact with the Kubernetes API. Every pod automatically gets the default ServiceAccount if not specified.

Why use custom ServiceAccounts?

Grant specific pods only the permissions they need (least privilege)
Use with IRSA (IAM Roles for Service Accounts) on EKS for AWS access
Audit trail — know which pod made which API call

Create a ServiceAccount with RBAC:

# 1. Create ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: deployment-manager
  namespace: production

---
# 2. Create Role with needed permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-role
  namespace: production
rules:
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "update", "patch"]

---
# 3. Bind the Role to the ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-role-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: deployment-manager
  namespace: production
roleRef:
  kind: Role
  name: deployment-role
  apiGroup: rbac.authorization.k8s.io

---
# 4. Use ServiceAccount in Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  serviceAccountName: deployment-manager    # Attach the SA here
  automountServiceAccountToken: true
  containers:
  - name: app
    image: my-app:v1

EKS IRSA — give pods AWS IAM permissions:

# Associate ServiceAccount with IAM role (no access keys needed in pods!)
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace production \
  --name s3-access-sa \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve

Q17

How do rolling updates work in Kubernetes? How do you do a rollback?

Intermediate

Rolling update replaces old pods with new ones gradually — ensuring zero downtime by keeping some old pods running until new ones are ready.

Old pods: [v1] [v1] [v1]
Step 1:   [v1] [v1] [v2]   ← 1 new pod added
Step 2:   [v1] [v2] [v2]   ← 1 old pod removed
Step 3:   [v2] [v2] [v2]   ← update complete

Configure rolling update strategy:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2          # Max 2 extra pods above desired count
      maxUnavailable: 0    # Never go below 6 pods (zero downtime)
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: web-app:v2.0
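
The effect of `maxSurge` and `maxUnavailable` on pod counts during a rollout can be computed directly. This sketch uses hypothetical helper names and assumes the documented rounding for percentages: `maxSurge` rounds up and `maxUnavailable` rounds down.

```python
import math


def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        fraction = replicas * int(value[:-1]) / 100
        return math.ceil(fraction) if round_up else math.floor(fraction)
    return int(value)


def rollout_bounds(replicas: int, max_surge, max_unavailable) -> dict:
    """Upper bound on total pods and lower bound on available pods during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {"max_total_pods": replicas + surge,
            "min_available_pods": replicas - unavailable}


print(rollout_bounds(6, 2, 0))            # the example above: at most 8 pods, never below 6
print(rollout_bounds(10, "25%", "25%"))   # percentages: at most 13 pods, at least 8 available
```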

Perform and monitor an update:

# Update the image
kubectl set image deployment/web-app web=web-app:v2.0 -n production

# Or edit deployment directly
kubectl edit deployment web-app -n production

# Watch rollout progress
kubectl rollout status deployment/web-app -n production
# Waiting for deployment "web-app" rollout to finish: 2 out of 6 new replicas...

# View rollout history
kubectl rollout history deployment/web-app -n production

# View details of a specific revision
kubectl rollout history deployment/web-app -n production --revision=2

Rollback:

# Rollback to previous version immediately
kubectl rollout undo deployment/web-app -n production

# Rollback to a specific revision
kubectl rollout undo deployment/web-app -n production --to-revision=3

# Pause a rollout (freeze mid-rollout)
kubectl rollout pause deployment/web-app -n production

# Resume a paused rollout
kubectl rollout resume deployment/web-app -n production

Recreate strategy (causes downtime; for apps that cannot run old and new versions side by side):

strategy:
  type: Recreate    # All old pods killed, then all new pods started

Q18

What are Init Containers? Give a real-world use case.

Intermediate

Init Containers run to completion before any application containers start. They run sequentially — each must succeed before the next starts.

Use cases:

Wait for a dependency (database) to be ready
Download configuration/secrets before app starts
Set file permissions on shared volumes
Run database migrations before app starts

Real-world example — wait for DB + run migrations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      initContainers:

      # Init 1 — wait for database to be ready
      - name: wait-for-db
        image: busybox:1.36
        command:
        - /bin/sh
        - -c
        - |
          until nc -z postgres-svc 5432; do
            echo "Waiting for database..."
            sleep 5
          done
          echo "Database is ready!"

      # Init 2 — run database migrations
      - name: run-migrations
        image: my-app:v2.0
        command: ["python", "manage.py", "migrate"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: DATABASE_URL

      # Main application container starts ONLY after both inits succeed
      containers:
      - name: web-app
        image: my-app:v2.0
        ports:
        - containerPort: 8000

Key differences from regular containers:

	Init Container	Regular Container
Runs	Before app containers	Parallel with other containers
Completion	Must finish (exit 0)	Runs indefinitely
Restart	Retried until success (pod fails if `restartPolicy` is Never)	Based on `restartPolicy`
Probes	No liveness/readiness	Supported
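
The sequential, all-must-succeed behaviour of init containers can be mimicked with a short simulation. `run_pod` is a hypothetical function and each container is modelled as a callable returning an exit code; unlike the kubelet, this sketch does not retry a failed init container.

```python
def run_pod(init_containers, app_containers):
    """init_containers: list of (name, callable returning an exit code)."""
    for name, run in init_containers:
        exit_code = run()
        if exit_code != 0:
            # A failing init container blocks everything after it.
            return f"blocked: init container {name} exited with {exit_code}"
    return "started: " + ", ".join(app_containers)


inits = [("wait-for-db", lambda: 0), ("run-migrations", lambda: 0)]
print(run_pod(inits, ["web-app"]))

failing = [("wait-for-db", lambda: 0), ("run-migrations", lambda: 1)]
print(run_pod(failing, ["web-app"]))
```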

Q19

How do you implement Pod Disruption Budgets (PDB) for high availability?

Intermediate

A PodDisruptionBudget (PDB) limits the number of pods that can be simultaneously unavailable during voluntary disruptions (node drains, upgrades) — ensuring your app stays highly available during maintenance.

Voluntary disruptions (PDB protects against these):

kubectl drain node during upgrades
Cluster autoscaler scaling down nodes
Node pool upgrades

Involuntary disruptions (PDB does NOT protect):

Node hardware failure
Kernel panic
OOMKill

Create a PDB:

# Option 1: minAvailable — always keep at least N pods running
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
  namespace: production
spec:
  minAvailable: 2           # Always keep at least 2 pods up
  selector:
    matchLabels:
      app: web-app

---
# Option 2: maxUnavailable — allow at most N pods to be down
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  maxUnavailable: 1         # Allow at most 1 pod to be unavailable
  selector:
    matchLabels:
      app: api-server

---
# Option 3: percentage
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: backend-pdb
spec:
  minAvailable: "80%"       # Always keep 80% of pods available
  selector:
    matchLabels:
      app: backend

# View PDBs
kubectl get pdb -n production

# Check PDB status
kubectl describe pdb web-app-pdb -n production
# Shows: Allowed disruptions, Current pods, etc.

# Drain a node (respects PDBs)
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data
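
The "Allowed disruptions" figure shown by `kubectl describe pdb` can be approximated for the `minAvailable` case. This is a sketch with a hypothetical function name; it assumes a percentage is rounded up to the next whole pod and does not cover `maxUnavailable`.

```python
import math


def allowed_disruptions(healthy_pods: int, expected_pods: int, min_available) -> int:
    """How many pods may be evicted right now without violating minAvailable."""
    if isinstance(min_available, str) and min_available.endswith("%"):
        required = math.ceil(expected_pods * int(min_available[:-1]) / 100)
    else:
        required = int(min_available)
    return max(0, healthy_pods - required)


print(allowed_disruptions(5, 5, 2))       # 3
print(allowed_disruptions(5, 5, "80%"))   # 1
print(allowed_disruptions(1, 3, 2))       # 0: a drain would block until pods recover
```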

Q20

How does Kubernetes handle secrets securely? What are the best practices?

Intermediate

By default, Kubernetes Secrets are stored base64-encoded (NOT encrypted) in etcd. This means anyone with etcd access can read them.

Best practices for Kubernetes Secret management:

1. Enable Encryption at Rest:

# /etc/kubernetes/encryption-config.yaml (on API server)
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>
  - identity: {}   # Fallback for unencrypted data
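
The `<base64-encoded-32-byte-key>` placeholder needs a random 32-byte value. One way to produce it, sketched in Python with the standard library (the function name is hypothetical):

```python
import base64
import secrets


def generate_encryption_key() -> str:
    """Return 32 cryptographically random bytes, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


key = generate_encryption_key()
print(key)
print(len(base64.b64decode(key)))  # 32
```

Treat the generated value like any other credential: keep it out of version control and restrict read access to the file on the control plane hosts.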

2. Use External Secret Managers (recommended for production):

# External Secrets Operator — syncs AWS Secrets Manager → K8s Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: db-secret              # Creates this K8s Secret
  data:
  - secretKey: DB_PASSWORD
    remoteRef:
      key: prod/myapp/database   # AWS Secrets Manager path
      property: password

3. RBAC to limit Secret access:

# Only allow specific service accounts to read secrets
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["db-secret", "api-key"]  # Specific secrets only
  verbs: ["get"]

4. Never do these:

# ❌ Never put secrets in ConfigMaps
# ❌ Never commit secrets to Git
# ❌ Never echo secrets in logs
# ❌ Avoid environment variables for highly sensitive data
#    (they are inherited by child processes and can leak via /proc/<pid>/environ, crash dumps, and debug output)

# ✅ Mount secrets as files instead
volumeMounts:
- name: db-credentials
  mountPath: /etc/secrets
  readOnly: true
volumes:
- name: db-credentials
  secret:
    secretName: db-secret
    defaultMode: 0400    # Read-only for owner only

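# Audit who accessed a secret
# `kubectl get events` does not record Secret reads; only API server audit logs do.
# Enable audit logging on kube-apiserver (--audit-policy-file, --audit-log-path)
# and log requests for `secrets` at the Metadata level so payloads are not written to the log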

Q21

How do you troubleshoot a Pod stuck in 'Pending' state?

Advanced

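A pod in Pending has been accepted by the API server but is not yet running. Most often the scheduler cannot find a suitable node because of a resource shortage or an unsatisfied constraint; less often the pod is already scheduled and is still waiting on an image pull or a volume mount.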

Systematic diagnosis:

# Step 1 — describe the pod (most important command)
kubectl describe pod <pod-name> -n <namespace>
# Look at the 'Events' section at the bottom

# Step 2 — check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
kubectl top nodes

# Step 3 — check if PVC is bound (if pod mounts one)
kubectl get pvc -n <namespace>

Common causes and fixes:

Cause 1: Insufficient CPU/Memory:

# Events will show:
# "0/3 nodes are available: 3 Insufficient cpu"

# Fix: Scale up node group or reduce resource requests
kubectl get nodes
kubectl describe node <node> | grep -A 10 "Allocated resources"

# Check what's using resources
kubectl top pods --all-namespaces --sort-by=cpu
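
The "Insufficient cpu" message comes from comparing requests, not live usage, against each node's allocatable capacity. The sketch below illustrates that check for a single node; the helper names and the numbers are hypothetical, and only the plain and millicore ("m") CPU formats are handled.

```python
def parse_cpu(quantity):
    """Convert a CPU quantity such as "250m" or "2" to millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


def fits(allocatable, existing_requests, pod_request):
    """True if the pod's CPU request fits into what is still unrequested on the node."""
    requested = sum(parse_cpu(r) for r in existing_requests)
    return parse_cpu(pod_request) <= parse_cpu(allocatable) - requested


# A 4-core node with 3.5 cores already requested by other pods.
print(fits("4", ["1500m", "2"], "500m"))
print(fits("4", ["1500m", "2"], "600m"))
```

This is why a node can look idle in `kubectl top nodes` and still reject a pod: oversized requests on existing pods reserve capacity even when it is unused.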

Cause 2: No nodes match nodeSelector/Affinity:

# Events: "0/3 nodes are available: 3 node(s) didn't match Pod's node affinity"

# Check node labels
kubectl get nodes --show-labels

# Add missing label to node
kubectl label node <node-name> disktype=ssd

Cause 3: PVC not bound:

# Events: "persistentvolumeclaim not found" or PVC stuck in Pending

kubectl describe pvc <pvc-name>
# Check if StorageClass exists
kubectl get storageclass

Cause 4: Taint not tolerated:

# Events: "0/3 nodes are available: 3 node(s) had untolerated taint"

kubectl describe nodes | grep -i taint
# Add toleration to pod spec

Cause 5: Too many pods on nodes (maxPods limit):

# Each node has a default limit of 110 pods
kubectl describe node <node> | grep "Non-terminated Pods"

Quick diagnosis script:

# List all pending pods, then show why the scheduler rejected them
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
kubectl get events --all-namespaces --field-selector=reason=FailedScheduling --sort-by=.metadata.creationTimestamp

Q22

How do you troubleshoot a Pod in 'CrashLoopBackOff'?

Advanced

CrashLoopBackOff = the container starts, crashes, Kubernetes restarts it — in a loop. The backoff time doubles each time (10s → 20s → 40s → … up to 5 min).
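
The doubling schedule described above can be written out as a short calculation. This is a sketch using the 10-second base and 5-minute cap quoted in the text; the function name is illustrative.

```python
def crashloop_delays(restarts, base=10, cap=300):
    """Delay in seconds before each of the first `restarts` restarts."""
    return [min(base * 2 ** attempt, cap) for attempt in range(restarts)]


print(crashloop_delays(7))
```

After the sixth restart the delay stays at the cap, so a pod that has been crashing for a while is retried only once every five minutes.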

Systematic diagnosis:

# Step 1 — describe pod for events and exit codes
kubectl describe pod <pod-name> -n <namespace>

# Step 2 — current logs (may be empty if app crashes immediately)
kubectl logs <pod-name> -n <namespace>

# Step 3 — PREVIOUS container logs (before the crash) — most useful
kubectl logs <pod-name> -n <namespace> --previous

# Step 4 — check exit code
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'

Exit code reference:

Exit Code	Meaning	Fix
`0`	Success (not a crash issue)	Check restart policy
`1`	App error	Check app logs
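`137`	SIGKILL (128 + 9), most often OOMKilled	Confirm `Reason: OOMKilled` in the describe output, then increase the memory limit or fix the leak
`139`	SIGSEGV (128 + 11), segmentation fault	Bug in app or wrong binary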
`143`	SIGTERM — graceful shutdown	Check if liveness probe is too aggressive
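
Exit codes above 128 follow a convention: the code is 128 plus the number of the signal that ended the process. A minimal decoder is sketched below; the lookup covers only the signals from the table, and the function name is illustrative.

```python
# Signal numbers as used on Linux for the cases in the table above.
COMMON_SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def describe_exit_code(code):
    """Explain a container exit code in terms of normal exit or terminating signal."""
    if code == 0:
        return "exited normally"
    if code > 128:
        number = code - 128
        return COMMON_SIGNALS.get(number, "signal %d" % number)
    return "application error %d" % code


for code in (0, 1, 137, 139, 143):
    print(code, describe_exit_code(code))
```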

Common fixes:

# Fix OOMKill (exit 137) — increase memory limit
kubectl patch deployment <name> -p \
  '{"spec":{"template":{"spec":{"containers":[{"name":"app","resources":{"limits":{"memory":"1Gi"}}}]}}}}'

# Fix: App can't connect to database
# Check if DB service is reachable from pod
kubectl exec -it <pod-name> -- nc -zv postgres-svc 5432

# Fix: Wrong image command — override to debug
kubectl run debug-pod \
  --image=<same-image> \
  --restart=Never \
  --rm -it \
  --command -- /bin/sh

# Fix: Liveness probe killing app too early — increase initialDelaySeconds
livenessProbe:
  initialDelaySeconds: 60    # Give app more time to start
  failureThreshold: 5

# Fix: Missing environment variable or secret
kubectl exec -it <pod-name> -- env | grep DB_   # Check env vars
kubectl describe pod <pod-name> | grep -A 5 "Environment"

Q23

How do you perform zero-downtime deployments in Kubernetes?

Advanced

Zero-downtime deployments require a combination of correct deployment strategy, pod lifecycle hooks, and health probes.

Complete zero-downtime deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2          # Can temporarily have 8 pods (6+2)
      maxUnavailable: 0    # Never drop below 6 healthy pods
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      # 1. Give pods time to finish in-flight requests before shutdown
      terminationGracePeriodSeconds: 60

      containers:
      - name: web
        image: web-app:v2.0
        ports:
        - containerPort: 8080

        # 2. Readiness probe — pod only gets traffic when truly ready
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 3

        # 3. Liveness probe — restart if pod is dead
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 15

        # 4. preStop hook — wait for traffic to drain before shutdown
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - sleep 15    # Wait 15s for load balancer to remove pod from rotation

        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
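
The effect of maxSurge and maxUnavailable on pod counts can be checked with a little arithmetic. The sketch below assumes the documented rounding for percentage values (maxSurge rounds up, maxUnavailable rounds down); the function names are illustrative.

```python
def resolve(value, replicas, round_up):
    """Resolve an int or a "25%"-style string against the replica count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        return (scaled + 99) // 100 if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (maximum total pods, minimum available pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


print(rollout_bounds(6, 2, 0))           # the Deployment above
print(rollout_bounds(10, "25%", "25%"))  # the Kubernetes default percentages
```

With maxUnavailable set to 0 the lower bound equals the replica count, so capacity never dips during the rollout, at the cost of needing spare room for the surge pods.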

Canary deployment pattern:

# Deploy v2 to 10% of traffic first
kubectl scale deployment web-app-v1 --replicas=9    # 9 old pods
kubectl scale deployment web-app-v2 --replicas=1    # 1 new pod (10%)

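# If v2 is healthy, shift the ratio gradually (scale v1 down as v2 goes up)
kubectl scale deployment web-app-v2 --replicas=5
kubectl scale deployment web-app-v1 --replicas=5    # 50% each
kubectl scale deployment web-app-v2 --replicas=10   # Full capacity on v2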
kubectl scale deployment web-app-v1 --replicas=0    # Remove old
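
With replica-based canaries, the traffic share depends on both replica counts, not on the canary count alone. The sketch below assumes both Deployments sit behind the same Service and that requests are spread evenly across ready pods; the function name is illustrative.

```python
def canary_share(stable_replicas, canary_replicas):
    """Approximate fraction of traffic reaching the canary pods."""
    total = stable_replicas + canary_replicas
    if total == 0:
        raise ValueError("no replicas are running")
    return canary_replicas / total


print(canary_share(9, 1))   # first step of the rollout above
print(canary_share(5, 5))   # even split
```

For finer control than replica ratios allow (for example 1% of traffic), a service mesh or an ingress controller with weighted routing is the usual alternative.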

Blue-Green deployment:

# Switch Service selector from blue to green instantly
kubectl patch service web-svc \
  -p '{"spec":{"selector":{"version":"v2"}}}'

# Rollback instantly by switching back
kubectl patch service web-svc \
  -p '{"spec":{"selector":{"version":"v1"}}}'

Q24

How does Kubernetes etcd work? What happens if etcd goes down?

Advanced

etcd is a distributed, consistent key-value store that serves as Kubernetes’ source of truth. Every object (pods, services, configmaps, secrets) is stored in etcd.

Architecture:

All cluster state stored in etcd:
/registry/pods/default/my-pod
/registry/services/default/my-svc
/registry/deployments/production/web-app
/registry/secrets/default/db-secret

etcd uses the Raft consensus algorithm:

Requires a quorum (majority) to function: floor(n/2) + 1
3 members → can tolerate 1 failure
5 members → can tolerate 2 failures
Always use odd numbers of etcd members

Cluster Size	Quorum	Tolerable Failures
1	1	0
3	2	1
5	3	2
7	4	3
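
The table follows directly from the majority rule. The short sketch below reproduces it and adds a four-member cluster to show why even sizes are discouraged; the function names are illustrative.

```python
def quorum(members):
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def tolerable_failures(members):
    """Members that can be lost while a majority still remains."""
    return members - quorum(members)


for size in (1, 3, 4, 5, 7):
    print(size, quorum(size), tolerable_failures(size))
```

A four-member cluster tolerates one failure, the same as three members, so the extra member adds replication cost without adding fault tolerance.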

What happens when etcd goes down:

etcd down → API server cannot read/write state
           → No new pods can be scheduled
           → Existing pods keep running (kubelet works independently)
           → kubectl commands fail
           → New deployments fail

Backup etcd (critical for disaster recovery):

# Take an etcd snapshot
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot-$(date +%Y%m%d).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot-$(date +%Y%m%d).db

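# Restore from snapshot (substitute the file name produced by the save step)
# On etcd 3.5 and later, `etcdutl snapshot restore` is the preferred command
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restored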

# Automate daily backups via CronJob
kubectl apply -f etcd-backup-cronjob.yaml

Best practices:

Always run etcd on separate dedicated nodes from worker nodes
Use SSDs — etcd is I/O intensive
Monitor etcd disk latency (99th-percentile WAL fsync should stay under 10 ms)
Take snapshots before every cluster upgrade

Q25

How do you secure a Kubernetes cluster? Walk through the security layers.

Advanced

Kubernetes security is a defence-in-depth approach with multiple layers:

Layer 1 — API Server security:

# Restrict anonymous access
--anonymous-auth=false

# Enable audit logging
--audit-log-path=/var/log/kubernetes/audit.log
--audit-policy-file=/etc/kubernetes/audit-policy.yaml

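# Disable insecure port (older clusters only: the flag was removed in v1.24,
# where the insecure port no longer exists and the flag must be omitted)
--insecure-port=0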

Layer 2 — RBAC (least privilege):

# Never use cluster-admin in applications
# Create minimal roles per service

# Audit RBAC permissions
kubectl auth can-i --list --as=system:serviceaccount:production:my-sa

Layer 3 — Network Policies (zero-trust networking):

# Deny all traffic by default, then allow explicitly
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: production
spec:
  podSelector: {}          # Applies to ALL pods in namespace
  policyTypes:
  - Ingress
  - Egress
  # No rules = deny all

Layer 4 — Pod Security (Security Context):

spec:
  securityContext:
    runAsNonRoot: true       # Never run as root
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault   # Enable seccomp filtering

  containers:
  - name: app
    image: my-app:v1
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true    # Container cannot write to its root filesystem
      capabilities:
        drop:
        - ALL                          # Drop all Linux capabilities
        add:
        - NET_BIND_SERVICE             # Add only what you need

Layer 5 — Image Security:

# Scan images before pushing
trivy image my-app:v1
grype my-app:v1

# Use Image Policy Webhook to block vulnerable images
# Use private registry — never use :latest tag in production

Layer 6 — Secrets Management:

# Enable encryption at rest for etcd
# Use External Secrets Operator with AWS Secrets Manager / Vault
# Rotate secrets regularly

Layer 7 — Runtime Security:

# Use Falco for runtime threat detection
helm install falco falcosecurity/falco \
  --namespace falco-system \
  --create-namespace

# Falco detects: shell in containers, privilege escalation,
# unexpected network connections, file system changes

Pod Security Admission (replaces PodSecurityPolicy, which was removed in v1.25):

# Label namespace to enforce security standards
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted    # Most strict
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Q26

What is Kubernetes Operator pattern? When would you build a custom operator?

Advanced

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application using Custom Resource Definitions (CRDs) and custom controllers that encode operational knowledge.

The Operator pattern:

Human Operator knowledge → encoded in → Custom Controller
                                           ↓
CRD (custom resource) → Controller reconciles → Desired state

When to build a Kubernetes Operator:

Managing stateful applications (databases, message queues)
Automating complex operational tasks (backups, upgrades, failover)
When your app needs more than Deployment/StatefulSet
Encoding domain-specific knowledge (e.g., how to scale a database cluster)

Example CRD — custom database resource:

# 1. Define the Custom Resource Definition
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.db.example.com
spec:
  group: db.example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
              version:
                type: string
              backupSchedule:
                type: string
  scope: Namespaced
  names:
    plural: postgresclusters
    singular: postgrescluster
    kind: PostgresCluster

---
# 2. Use the custom resource (like any K8s object now)
apiVersion: db.example.com/v1
kind: PostgresCluster
metadata:
  name: my-database
  namespace: production
spec:
  replicas: 3
  version: "15.2"
  backupSchedule: "0 2 * * *"    # Operator handles backups automatically

Popular real-world operators:

# Install cert-manager operator (manages TLS certificates)
# Newer chart versions use --set crds.enabled=true in place of installCRDs
helm install cert-manager jetstack/cert-manager --set installCRDs=true

# Install Prometheus operator (manages monitoring stack)
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack

# Install Strimzi operator (manages Kafka clusters)
helm install strimzi-kafka-operator strimzi/strimzi-kafka-operator

Build your own operator:

# Use Operator SDK (most popular framework)
operator-sdk init --domain example.com --repo github.com/example/my-operator
operator-sdk create api --group apps --version v1 --kind MyApp --resource --controller

# Or use Kubebuilder
kubebuilder init --domain example.com
kubebuilder create api --group apps --version v1 --kind MyApp

Q27

Explain Kubernetes resource management — LimitRange, ResourceQuota, and Priority Classes.

Advanced

Three mechanisms control resource usage at different levels:

1. ResourceQuota — limits resources for an entire namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute resources
    requests.cpu: "20"           # Max 20 CPU cores requested in namespace
    requests.memory: 40Gi        # Max 40Gi memory requested
    limits.cpu: "40"
    limits.memory: 80Gi

    # Object count limits
    pods: "100"
    services: "20"
    persistentvolumeclaims: "30"
    secrets: "50"
    configmaps: "50"

    # Storage limits
    requests.storage: "500Gi"
    storageclass.storage.k8s.io/fast-ssd.requests.storage: "200Gi"

2. LimitRange — sets defaults and limits per Pod/Container:

apiVersion: v1
kind: LimitRange
metadata:
  name: production-limitrange
  namespace: production
spec:
  limits:
  # Container-level defaults and max/min
  - type: Container
    default:               # Default LIMIT if not specified
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:        # Default REQUEST if not specified
      cpu: "100m"
      memory: "128Mi"
    max:                   # Largest limit any container may set
      cpu: "4"
      memory: "8Gi"
    min:                   # Smallest request any container may set
      cpu: "50m"
      memory: "64Mi"

  # Pod-level maximum
  - type: Pod
    max:
      cpu: "8"
      memory: "16Gi"

  # PVC storage limits
  - type: PersistentVolumeClaim
    max:
      storage: "100Gi"
    min:
      storage: "1Gi"
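
The defaulting behaviour of the Container entry above can be sketched as a small function. It assumes the documented rule that a container which sets only a limit gets a request equal to that limit rather than the defaultRequest value; the function name and the dictionary shapes are illustrative.

```python
def apply_limitrange_defaults(resources, default_limits, default_requests):
    """Fill in missing requests and limits the way a Container LimitRange would."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    names = set(default_limits) | set(default_requests) | set(limits)
    for name in sorted(names):
        # A limit without a request: the request copies the limit.
        if name not in requests and name in limits:
            requests[name] = limits[name]
        if name not in requests and name in default_requests:
            requests[name] = default_requests[name]
        if name not in limits and name in default_limits:
            limits[name] = default_limits[name]
    return {"requests": requests, "limits": limits}


DEFAULT_LIMITS = {"cpu": "500m", "memory": "256Mi"}
DEFAULT_REQUESTS = {"cpu": "100m", "memory": "128Mi"}

print(apply_limitrange_defaults({}, DEFAULT_LIMITS, DEFAULT_REQUESTS))
print(apply_limitrange_defaults({"limits": {"memory": "1Gi"}}, DEFAULT_LIMITS, DEFAULT_REQUESTS))
```

The sketch only covers defaulting. The min and max checks run afterwards, and a pod whose resulting values fall outside that range is rejected at admission.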

3. PriorityClass (scheduling order, preemption, and eviction ranking):

# High priority — for critical system workloads
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "Critical production services"

---
# Low priority — for batch jobs
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
description: "Batch jobs and non-critical workloads"

---
# Use in Pod
spec:
  priorityClassName: high-priority    # Scheduled ahead of, and may preempt, lower-priority pods
  containers:
  - name: critical-app
    image: my-app:v1

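Eviction order during node pressure (approximate; the kubelet first targets pods whose usage exceeds their requests, then ranks by pod priority):

BestEffort (no requests/limits) → evicted FIRST
Burstable (requests < limits, or only partly set) → evicted SECOND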
Guaranteed (requests = limits)  → evicted LAST
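
How a pod ends up in one of these three classes can be sketched as follows. This is a simplified illustration: it compares quantities as plain strings (so "1" and "1000m" would not match), and the function name and input shape are assumptions made for the example.

```python
def qos_class(containers):
    """Classify a pod from its containers' {"requests": ..., "limits": ...} settings."""
    anything_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            anything_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An omitted request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


equal = {"cpu": "500m", "memory": "512Mi"}
print(qos_class([{}]))
print(qos_class([{"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}]))
print(qos_class([{"requests": equal, "limits": equal}]))
```

Every container must qualify for the pod to be Guaranteed; one sidecar without limits is enough to make the whole pod Burstable.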

Q28

How do you implement GitOps with Kubernetes using ArgoCD?

Advanced

GitOps is a deployment methodology where Git is the single source of truth for cluster state. ArgoCD continuously syncs the cluster to match what’s in Git.

GitOps principles:

Entire system described declaratively in Git
Desired state versioned in Git
Approved changes automatically applied to the cluster
Software agents ensure correctness and alert on divergence

Install ArgoCD:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Get initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

# Port-forward to access UI
kubectl port-forward svc/argocd-server -n argocd 8080:443

Create an ArgoCD Application:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web-app-production
  namespace: argocd
spec:
  project: default

  # Source — where your manifests live in Git
  source:
    repoURL: https://github.com/myorg/k8s-configs
    targetRevision: main
    path: apps/web-app/production

  # Destination — where to deploy in the cluster
  destination:
    server: https://kubernetes.default.svc
    namespace: production

  # Sync policy — auto-sync when Git changes
  syncPolicy:
    automated:
      prune: true         # Delete resources removed from Git
      selfHeal: true      # Revert manual changes to cluster
    syncOptions:
    - CreateNamespace=true
    - PrunePropagationPolicy=foreground
    retry:
      limit: 5
      backoff:
        duration: 5s
        maxDuration: 3m
        factor: 2

GitOps workflow:

# Developer makes a change
git checkout -b feature/update-image
# Edit apps/web-app/production/deployment.yaml and change the image tag
git commit -am "deploy: bump web-app to v2.5"
git push origin feature/update-image

# Create PR → review → merge to main
# ArgoCD detects the change within 3 minutes
# ArgoCD applies the change to cluster automatically

# Check sync status
argocd app get web-app-production
argocd app sync web-app-production    # Manual sync if needed
argocd app history web-app-production # Deployment history

Q29

How do you debug a node that is NotReady in Kubernetes?

Advanced

NotReady means the control plane cannot communicate with the node or the node’s conditions are failing.

Immediate diagnosis:

# Step 1 — check node status and conditions
kubectl get nodes
kubectl describe node <node-name>

# Look for conditions:
# Ready = False/Unknown
# MemoryPressure = True
# DiskPressure = True
# PIDPressure = True
# NetworkUnavailable = True

Step 2 — SSH into the node and check:

# Check kubelet status (most common cause)
sudo systemctl status kubelet
sudo journalctl -u kubelet --no-pager -n 50

# Common kubelet errors:
# "failed to get node info" → network issue
# "certificate expired" → renew kubelet certificates
# "PLEG is not healthy" → pod lifecycle event generator issues (often disk pressure)

# Check node resources
df -h              # Disk usage (DiskPressure at roughly 85-90% used with default thresholds)
free -m            # Memory (MemoryPressure)
top                # CPU and process check
ps aux | wc -l     # Process count (PIDPressure when free PIDs run low; compare with /proc/sys/kernel/pid_max)

Fix common causes:

# Fix 1: kubelet not running
sudo systemctl restart kubelet

# Fix 2: Disk pressure — clean up
docker system prune -af         # Clean Docker images/containers
crictl rmi --prune              # Clean containerd images
sudo journalctl --vacuum-size=500M    # Clean journal logs

# Fix 3: Certificate expired (kubeadm clusters)
sudo kubeadm certs renew all    # On versions before v1.20: kubeadm alpha certs renew all
sudo systemctl restart kubelet

# Fix 4: Network plugin not running
kubectl get pods -n kube-system | grep -E "calico|flannel|cilium"
kubectl delete pod -n kube-system <broken-cni-pod>   # Restart CNI pod

# Fix 5: Node has too many pods — eviction happening
kubectl describe node <node> | grep -i "eviction\|pressure"

Cordon and drain a problematic node:

# Prevent new pods from scheduling on this node
kubectl cordon <node-name>

# Move existing pods to other nodes
# (--force also deletes pods that no controller will recreate)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --force

# After fixing the node, uncordon it
kubectl uncordon <node-name>

Q30

Design a production-grade Kubernetes cluster architecture for a high-traffic application.

Advanced

A production-grade Kubernetes architecture for high-traffic needs to address availability, security, scalability, and observability.

Cluster architecture:

                    ┌─────────────────────────────────┐
                    │   CONTROL PLANE (HA)             │
                    │   3x master nodes (multi-AZ)     │
                    │   etcd cluster (separate nodes)  │
                    └──────────────┬──────────────────┘
                                   │
         ┌─────────────────────────┼─────────────────────────┐
         ▼                         ▼                          ▼
  ┌─────────────┐          ┌─────────────┐           ┌─────────────┐
  │ AZ-1 Nodes  │          │ AZ-2 Nodes  │           │ AZ-3 Nodes  │
  │ App workers │          │ App workers │           │ App workers │
  │ GPU nodes   │          │ GPU nodes   │           │ Spot nodes  │
  └─────────────┘          └─────────────┘           └─────────────┘

Node pool strategy:

# System node pool: cluster add-ons such as CoreDNS, ingress, and monitoring (the control plane itself runs on its own nodes)
nodePool: system
  instanceType: m5.xlarge
  count: 3
  taints: [CriticalAddonsOnly=true:NoSchedule]

# Application node pool — production workloads (on-demand)
nodePool: app-ondemand
  instanceType: m5.2xlarge
  minCount: 6
  maxCount: 50
  availabilityZones: [us-east-1a, us-east-1b, us-east-1c]

# Spot node pool — batch/non-critical workloads (80% cheaper)
nodePool: app-spot
  instanceTypes: [m5.2xlarge, m5.4xlarge, m5a.2xlarge]
  spot: true
  minCount: 0
  maxCount: 100

Production deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 9              # 3 per AZ
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 3
      maxUnavailable: 0
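  selector:                # Required for apps/v1; must match the template labels
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec: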
      # Spread across zones and nodes
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web-app
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: web-app
      # Don't schedule on spot nodes (critical app)
      # The lifecycle label key below depends on the node provisioner; adjust it to your platform
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node.kubernetes.io/lifecycle
                operator: NotIn
                values: [spot]
      containers:
      - name: web-app
        image: web-app:v3.0
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
        readinessProbe:
          httpGet: {path: /ready, port: 8080}
          periodSeconds: 5
        livenessProbe:
          httpGet: {path: /health, port: 8080}
          periodSeconds: 15
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 15"]
      terminationGracePeriodSeconds: 60
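
The zone constraint with maxSkew: 1 can be reasoned about with a simple count. The sketch below follows the skew definition from the Kubernetes documentation (pods in the target zone after placement, minus the count in the least-loaded zone); the function name and the zone names are hypothetical.

```python
def can_place(pods_per_zone, zone, max_skew=1):
    """True if adding one pod to `zone` keeps the skew within max_skew."""
    least_loaded = min(pods_per_zone.values())
    return pods_per_zone[zone] + 1 - least_loaded <= max_skew


balanced = {"us-east-1a": 3, "us-east-1b": 3, "us-east-1c": 3}
uneven = {"us-east-1a": 4, "us-east-1b": 3, "us-east-1c": 3}

print(can_place(balanced, "us-east-1a"))
print(can_place(uneven, "us-east-1a"))
print(can_place(uneven, "us-east-1b"))
```

Because the zone rule uses DoNotSchedule, a pod that fails this check everywhere stays Pending, whereas the hostname rule with ScheduleAnyway is only a preference.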

Observability stack:

# Metrics — Prometheus + Grafana
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack

# Logs — EFK or Loki stack
helm install loki grafana/loki-stack --set grafana.enabled=false

# Tracing — Jaeger or Tempo
helm install jaeger jaegertracing/jaeger

# Alerts — configure PagerDuty/Slack in Alertmanager

Key SLOs to monitor:

# Error rate < 0.1%
# P99 latency < 200ms
# Availability > 99.9%
# Pod restart rate < 1/hour
# Node CPU < 70%
# Node Memory < 80%
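
An availability target translates into a concrete downtime allowance, which makes the SLO easier to discuss. A minimal sketch follows, assuming a 30-day window; the function name is illustrative.

```python
def downtime_budget_minutes(availability_target, window_days=30):
    """Minutes of downtime allowed in the window for a given availability target."""
    return window_days * 24 * 60 * (1 - availability_target)


for target in (0.99, 0.999, 0.9999):
    print(target, round(downtime_budget_minutes(target), 1))
```

Each additional nine cuts the budget by a factor of ten, which is a useful check on whether the architecture above (multi-AZ, zero-downtime rollouts) is sufficient for the target.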
