DevOps Interview Questions – Industry Standard (25 Q&A)

Designed for L2/L3 DevOps, SRE, and Platform Engineer interviews. Covers architecture, CI/CD design, production failures, LAB commands, and field scenarios.

1️⃣ DevOps Fundamentals & Architecture

DevOps Lifecycle
Q1. Explain DevOps from an enterprise architecture perspective.
  1. Operating model, not a toolset
  2. Git as single source of truth
  3. Automated CI/CD
  4. Infrastructure as Code
  5. Observability-driven feedback
Q2. Scenario: Deployment is successful but users see 500 errors.
Production deployment is green, but customers report application failures.
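  1. Check application and ingress logs for the failing requests
  2. Verify configs and secrets shipped with the new release
  3. Validate downstream dependencies (database, cache, external APIs)
  4. Review dashboards for error rate and latency since the deploy
  5. Roll back if the error rate does not recover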
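The rollback step is easier to defend when it has an explicit trigger. A minimal sketch, assuming status codes sampled from access logs and a hypothetical 5% threshold (neither comes from a specific tool):

```python
from typing import Iterable


def error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses that are HTTP 5xx; 0.0 for an empty sample."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    errors = sum(1 for code in codes if 500 <= code <= 599)
    return errors / len(codes)


def should_rollback(status_codes: Iterable[int], threshold: float = 0.05) -> bool:
    """True when the 5xx rate exceeds the threshold (0.05 is an assumed default)."""
    return error_rate(status_codes) > threshold
```

In practice the sample would come from the metrics backend over a short window after the deploy, and the threshold would be tuned per service.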
Q3. Explain the CALMS model.
Culture, Automation, Lean, Measurement, Sharing.
Q4. How does DevOps reduce MTTR?
By shortening every stage of the incident path: Monitoring → Alert → Runbook → Auto-rollback → Postmortem.
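MTTR itself is simple arithmetic: the mean of (resolved time minus detected time) across incidents. A small sketch, assuming each incident is recorded as a hypothetical (detected, resolved) pair of datetimes:

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Incident = Tuple[datetime, datetime]  # (detected, resolved), an assumed record shape


def mttr(incidents: List[Incident]) -> timedelta:
    """Mean time to recovery across the given incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),  # start value so the durations sum as timedeltas
    )
    return total / len(incidents)
```

Two incidents lasting 30 and 90 minutes give an MTTR of 60 minutes.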
Q5. Common DevOps anti-patterns?
Tool-only DevOps, silos, manual gates, no prod ownership.

2️⃣ CI/CD – Design & Troubleshooting

CI/CD Pipeline
Q6. Design a CI/CD pipeline for microservices.
  1. Git commit
  2. SAST
  3. Tests
  4. Docker build
  5. Security scan
  6. Staging deploy
  7. Canary / Blue-Green
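The key property of the stage list above is fail-fast ordering: a later stage never runs if an earlier one fails. A tool-agnostic sketch, where each stage is a hypothetical callable returning True on success (a real pipeline would declare this in Jenkins, GitLab CI, or similar):

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]  # (name, step); step returns True on success


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (succeeded, names of the stages that were executed).
    """
    executed = []
    for name, step in stages:
        executed.append(name)
        if not step():
            return False, executed
    return True, executed
```

If the test stage fails, the Docker build and everything after it is skipped, so no unverified image reaches staging.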
Q7. LAB: Jenkins logs
kubectl logs jenkins-0 -n jenkins
Q8. Pipeline passes but prod fails?
Missing secrets, environment mismatch between staging and prod, wrong image tag, resource limits set too low.
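Two of these causes can be caught by a pre-deploy check. A minimal sketch, assuming the pipeline can see the target environment's variables as a dict; the variable names and image references used with it are hypothetical:

```python
from typing import Dict, List


def preflight(env: Dict[str, str], required: List[str], image: str) -> List[str]:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    for key in required:
        if not env.get(key):
            problems.append(f"missing or empty variable: {key}")

    # Look only at the last path segment so a registry port
    # (registry.local:5000/app) is not mistaken for a tag.
    name = image.rsplit("/", 1)[-1]
    pinned_by_digest = "@" in name
    tag = name.split(":", 1)[1] if ":" in name else ""
    if not pinned_by_digest and tag in ("", "latest"):
        problems.append(f"image is not pinned to a version: {image}")
    return problems
```

Failing the pipeline when this list is non-empty turns a silent prod failure into a visible pipeline failure.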
Q9. CI/CD security?
Vault, RBAC, artifact signing, pipeline isolation.
Q10. What is GitOps?
Git holds the declared desired state; an agent (for example Argo CD or Flux) continuously reconciles the live system to match it.
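The reconcile idea fits in a few lines: compare desired state with actual state and emit the actions that close the gap. This is an illustrative sketch over plain dicts, not the algorithm of any particular GitOps tool:

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return (action, resource_name) pairs that move actual toward desired."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

When the two states match, the result is an empty list, which is the steady state a reconciler loops toward.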

3️⃣ Docker & Containers

Docker Architecture
Q11. Why containers?
Eliminate environment drift (the same image runs everywhere) and speed up delivery.
Q12. LAB: Debug container
docker ps -a
docker logs <id>
docker exec -it <id> /bin/sh
Q13. Works locally, fails in K8s?
Missing ConfigMaps or Secrets, wrong resource limits.
Q14. Optimize images?
Multi-stage builds, slim images.
Q15. Docker networking?
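Bridge (default, single host), Host (container shares the host's network stack), Overlay (spans multiple hosts, used by Swarm). Other drivers include none and macvlan.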

4️⃣ Kubernetes

Q16. K8s architecture?
Control plane (kube-apiserver, etcd, scheduler, controller-manager) + worker nodes (kubelet, kube-proxy, container runtime).
Q17. LAB: Pod debug
kubectl get pods
kubectl describe pod <pod>
kubectl logs <pod>
Q18. Service unreachable?
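Check that Service selectors match pod labels, endpoints are populated, ingress rules route to the Service, and NetworkPolicies allow the traffic.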
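A selector typo is the most common cause, and the matching rule is simple: a pod backs a Service only if it carries every key/value pair in the selector. A sketch of that rule, with hypothetical pod names and labels:

```python
from typing import Dict, List


def selects(selector: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """Equality-based match: every selector pair must be present on the pod."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def endpoints(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Names of the pods the selector would pick up."""
    return [name for name, labels in pods.items() if selects(selector, labels)]
```

An empty result here corresponds to an empty endpoints list on the cluster, which is why checking the Service's endpoints is a quick early step.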
Q19. Zero downtime?
Rolling updates, readiness and liveness probes, multiple replicas.
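For a rolling update it helps to know how many pods stay available. A sketch of the arithmetic for percentage values of maxSurge and maxUnavailable (both default to 25%; surge rounds up, unavailable rounds down); the function name is hypothetical:

```python
from typing import Tuple


def rollout_bounds(
    replicas: int, max_surge_pct: int = 25, max_unavailable_pct: int = 25
) -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = -(-replicas * max_surge_pct // 100)          # ceiling division
    unavailable = replicas * max_unavailable_pct // 100  # floor division
    return replicas - unavailable, replicas + surge
```

With 4 replicas and the defaults, at least 3 pods stay available and at most 5 exist at once. With 3 replicas the floor gives 0 unavailable, so all 3 stay up while one extra pod surges.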
Q20. StatefulSet vs Deployment?
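StatefulSet: stable pod names, ordered rollout, and a persistent volume per pod (databases, queues). Deployment: interchangeable replicas for stateless apps.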

5️⃣ Infrastructure as Code

Terraform
Q21. Why IaC?
Consistency across environments, an audit trail in version control, and repeatable disaster recovery (DR).
Q22. LAB: Terraform
terraform init
terraform plan
terraform apply
Q23. Terraform deleted prod?
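Wrong workspace selected, or missing or stale state. Prevent it with a reviewed plan, remote state with locking, and lifecycle prevent_destroy on critical resources.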
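One guard is to inspect the plan before apply. The JSON produced by terraform show -json for a saved plan has a resource_changes list, where each entry carries an address and change.actions. A sketch that flags anything being deleted; the resource addresses used with it are hypothetical:

```python
from typing import List


def destroyed_resources(plan: dict) -> List[str]:
    """Addresses of resources whose planned actions include a delete.

    A replace shows up as both delete and create, so it is flagged too.
    """
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions:
            flagged.append(change["address"])
    return flagged
```

A pipeline step could load the plan JSON with json.load, call this function, and require manual approval whenever the list is non-empty.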
Q24. Traffic spike?
Autoscaling, caching, rate limiting.
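For the autoscaling part, the Kubernetes Horizontal Pod Autoscaler scales on the ratio of current to target metric: desired = ceil(current replicas × current metric / target metric). A sketch of that formula with min/max clamping; the real controller also applies a tolerance band and stabilization, which are not modeled here, and the defaults below are assumptions:

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Replica count suggested by the ratio formula, clamped to [min, max]."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

With 4 replicas at 90% CPU against a 60% target, the formula asks for 6 replicas.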
Q25. Telecom-grade DevOps?
HA, zero downtime, observability, auto-recovery.