DevOps Interview Questions – Industry Standard (25 Q&A)
Designed for L2 / L3 DevOps, SRE, Platform Engineer interviews.
Covers architecture, CI/CD design, production failures, LAB commands, and field scenarios.
1️⃣ DevOps Fundamentals & Architecture
Q1. Explain DevOps from an enterprise architecture perspective.
- DevOps is an operating model, not a toolset.
- Git is the single source of truth.
- CI/CD automates build, test, deploy.
- Infrastructure is managed via IaC.
- Monitoring and feedback close the loop.
Q2. Scenario: Deployment is successful but users see 500 errors.
Production deployment is green, but customers report application failures.
- Check application logs
- Verify config and secrets
- Check downstream dependencies
- Review monitoring dashboards
- Rollback if required
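The "rollback if required" step can be made mechanical by comparing the post-deploy 5xx rate with a threshold. This is a minimal sketch, assuming status codes have already been collected from access logs or metrics. The function names and the 5% threshold are illustrative and do not come from any specific tool.

```python
from typing import Iterable


def error_rate(status_codes: Iterable[int]) -> float:
    """Return the fraction of responses that are server errors (5xx)."""
    codes = list(status_codes)
    if not codes:
        return 0.0  # no traffic observed; treat as no evidence of failure
    server_errors = sum(1 for code in codes if 500 <= code <= 599)
    return server_errors / len(codes)


def should_rollback(status_codes: Iterable[int], threshold: float = 0.05) -> bool:
    """Recommend a rollback when the 5xx rate exceeds the assumed threshold."""
    return error_rate(status_codes) > threshold
```

In practice the same comparison would usually run against a metrics query over a short window after the deploy, not a raw list.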
Q3. Explain the CALMS model.
Culture, Automation, Lean, Measurement, Sharing – applied together in real teams.
Q4. How does DevOps reduce MTTR?
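By shortening every step from failure to fix: monitoring detects the problem, alerts notify the owner, runbooks guide the response, auto-rollback restores service, and postmortems prevent repeats.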
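To show that MTTR is actually being reduced, it has to be measured. A minimal sketch, assuming MTTR is defined as the average time from detection to resolution and that incident timestamps are available. The incident data below is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_recovery(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average (resolved - detected) over all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents as (detected, resolved) pairs.
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 minutes
    (datetime(2024, 1, 5, 14, 0), datetime(2024, 1, 5, 15, 30)),  # 90 minutes
]
mttr = mean_time_to_recovery(incidents)
```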
Q5. Common DevOps anti-patterns seen in enterprises?
Tool-driven DevOps, siloed teams, manual approvals, no ownership of production.
2️⃣ CI/CD – Design & Troubleshooting
Q6. Design a CI/CD pipeline for microservices.
- Code commit to Git
- Static analysis (SAST)
- Unit & integration tests
- Docker image build
- Security scan
- Deploy to staging
- Canary / Blue-Green production deploy
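The ordering above matters because each stage gates the next. A minimal sketch of a fail-fast stage runner, assuming each stage can be represented as a function that returns True on success. The stage names and the lambda placeholders are illustrative; a real pipeline would call the actual tools.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (succeeded, names of the stages that were executed).
    """
    executed: List[str] = []
    for name, step in stages:
        executed.append(name)
        if not step():
            return False, executed  # fail fast: later stages never run
    return True, executed


# Placeholder steps standing in for real tool invocations.
stages = [
    ("sast", lambda: True),
    ("tests", lambda: True),
    ("image-build", lambda: True),
    ("security-scan", lambda: False),  # simulated scan failure
    ("deploy-staging", lambda: True),
    ("deploy-production", lambda: True),
]
ok, executed = run_pipeline(stages)
```

Here the simulated scan failure stops the run before anything is deployed.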
Q7. LAB: Check Jenkins pipeline logs
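kubectl logs jenkins-0 -n jenkins    # controller logs, assuming Jenkins runs on Kubernetes as pod jenkins-0 in namespace jenkins
For the output of a single build, open that build's Console Output in the Jenkins UI.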
Q8. Scenario: Pipeline passes but production fails.
Build and tests passed, but pods crash after deployment.
- Missing secrets
- Environment variable mismatch
- Incorrect image tag
- Resource limits differ from staging
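Missing secrets and environment mismatches can often be caught before rollout with a preflight check. A minimal sketch, assuming the required settings are known up front. The key names and values are hypothetical.

```python
from typing import Dict, Iterable, List


def missing_settings(required: Iterable[str], provided: Dict[str, str]) -> List[str]:
    """Return required keys that are absent or empty in the target environment."""
    return sorted(key for key in required if not provided.get(key))


# Hypothetical configuration for illustration only.
required = ["DATABASE_URL", "API_KEY", "LOG_LEVEL"]
production = {"DATABASE_URL": "postgres://db/prod", "LOG_LEVEL": ""}
problems = missing_settings(required, production)
```

Running this as a pipeline gate turns a post-deploy crash into a pre-deploy failure.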
Q9. How do you secure CI/CD pipelines?
Vault-based secrets, RBAC, artifact signing, pipeline isolation.
Q10. What is GitOps and why do companies adopt it?
Git defines desired state; systems reconcile actual state automatically.
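The reconcile step can be pictured as a diff between two maps. This is a minimal sketch, assuming desired and actual state are plain dictionaries keyed by resource name. It illustrates the idea only and is not how any particular GitOps controller is implemented.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # (verb, resource name)


def plan_reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Action]:
    """Compute the actions needed to make `actual` match `desired`."""
    actions: List[Action] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))  # pruning: not declared in Git
    return actions


# Hypothetical state for illustration only.
desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
actual = {"web": {"replicas": 1}, "worker": {"replicas": 1}}
actions = plan_reconcile(desired, actual)
```

A controller would run this comparison in a loop, so manual changes to the cluster are reverted on the next pass.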
3️⃣ Docker & Containerization (Production Reality)
Q11. What real problems do containers solve?
Environment drift, dependency conflicts, slow deployments.
Q12. LAB: Debug a crashing container
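docker ps -a                              # list all containers, including exited ones
docker logs <container_id>                # read the output of the crashed container
docker inspect <container_id>             # check State.ExitCode and State.OOMKilled
docker exec -it <container_id> /bin/sh    # works only while the container is running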
Q13. Scenario: Container works locally but fails in Kubernetes.
ConfigMaps missing, secrets not mounted, wrong resource limits.
Q14. How do you optimize Docker images?
Multi-stage builds, minimal base images, remove build tools.
Q15. Explain Docker networking modes.
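Bridge (default; containers on a single host), Overlay (multi-host, for example Swarm), Host (shares the host network stack; no isolation, lowest overhead), None (no networking).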
4️⃣ Kubernetes – Design & Failure Handling
Q16. Explain Kubernetes architecture end-to-end.
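Control plane (API server, etcd, scheduler, controller manager) and worker nodes (kubelet, kube-proxy, container runtime). A request goes from kubectl to the API server, is persisted in etcd, is assigned to a node by the scheduler, and is started on that node by the kubelet.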
Q17. LAB: Troubleshoot a failing pod
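kubectl get pods                      # check STATUS and RESTARTS
kubectl describe pod <pod-name>       # check Events for scheduling, image pull, and probe failures
kubectl logs <pod-name>               # logs from the current container
kubectl logs <pod-name> --previous    # logs from the previous, crashed container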
Q18. Scenario: Service is running but not reachable.
Check selectors, endpoints, ingress, and network policies.
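The most common cause is a selector that does not match the pod labels, which leaves the Service with no endpoints. A minimal sketch of equality-based selector matching, assuming labels are plain key/value maps. The pod names and labels are hypothetical.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True when every selector key/value pair is present in the pod labels.

    An empty selector returns False here, mirroring the fact that a Service
    without a selector gets no automatically managed endpoints.
    """
    return bool(selector) and all(
        labels.get(key) == value for key, value in selector.items()
    )


def matching_pods(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of the pods the selector would pick up."""
    return sorted(name for name, labels in pods.items() if selector_matches(selector, labels))


service_selector = {"app": "web", "tier": "frontend"}
pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "front-end"},  # label value does not match
}
endpoints = matching_pods(service_selector, pods)
```

On a live cluster the equivalent check is to compare the Service selector with the pod labels and confirm that the endpoints list is not empty.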
Q19. How does Kubernetes ensure zero downtime?
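Rolling updates replace pods gradually, readiness probes keep traffic away from pods that are not ready, and multiple replicas keep capacity available during the rollout. Together these enable zero downtime; they do not guarantee it if probes are missing or wrong.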
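How much capacity stays available during a rollout depends on the Deployment's maxUnavailable and maxSurge settings, which both default to 25%. A minimal sketch of the arithmetic, assuming percentage values; Kubernetes rounds maxUnavailable down and maxSurge up. The function name is illustrative.

```python
from typing import Tuple


def rolling_update_bounds(
    replicas: int, max_unavailable_pct: int = 25, max_surge_pct: int = 25
) -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    # maxUnavailable rounds down, so small Deployments keep every pod available.
    max_unavailable = replicas * max_unavailable_pct // 100
    # maxSurge rounds up, so at least one extra pod can always be created.
    max_surge = -(-replicas * max_surge_pct // 100)
    return replicas - max_unavailable, replicas + max_surge
```

With a single replica the result is (1, 2): the new pod has to become ready before the old one is removed, which is why readiness probes matter.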
Q20. StatefulSet vs Deployment – real use case?
Databases and other workloads that need a stable pod identity and per-pod persistent storage use StatefulSet; stateless apps use Deployment.
5️⃣ Infrastructure as Code & Field Scenarios
Q21. Why is IaC mandatory in large enterprises?
Consistency, auditability, faster disaster recovery.
Q22. LAB: Terraform workflow
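terraform init                # download providers and configure the backend
terraform plan -out=tfplan    # preview the changes and save the plan
terraform apply tfplan        # apply exactly the plan that was reviewed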
Q23. Scenario: Terraform apply deleted production resources.
Typical causes: applying in the wrong workspace, a stale or corrupted state file, and no prevent_destroy lifecycle rule on critical resources.
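One guard against this is to inspect the plan for deletions before applying it. A minimal sketch, assuming the plan has been exported with terraform show -json and parsed into a dictionary. The resource addresses are hypothetical, and the sample is trimmed to the fields the check reads.

```python
from typing import List


def destructive_changes(plan: dict) -> List[str]:
    """List resource addresses whose planned actions include a delete."""
    flagged: List[str] = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions:  # covers plain deletes and replacements
            flagged.append(change["address"])
    return flagged


# Trimmed, hypothetical plan in the shape of the JSON plan output.
plan = {
    "resource_changes": [
        {"address": "aws_instance.web", "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
    ]
}
flagged = destructive_changes(plan)
```

A pipeline could fail the apply stage, or require a manual approval, whenever the returned list is not empty.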
Q24. Scenario: Sudden traffic spike brings system down.
Autoscaling, caching, rate limiting, DB tuning.
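Rate limiting is the fastest of these to reason about in an interview. A minimal sketch of a token bucket, one common rate-limiting algorithm, assuming time is passed in explicitly as seconds so the behavior is deterministic. The class and parameter names are illustrative.

```python
class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is admitted
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=2.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # third request exceeds the burst
later = bucket.allow(now=1.0)                      # one token refilled after a second
```

Rejected requests would typically get an HTTP 429 response, which protects the database while autoscaling catches up.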
Q25. How would you design DevOps for telecom-grade systems?
High availability, zero downtime, strong observability, automated recovery.
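It helps to put a number on "high availability". A minimal sketch of the downtime budget for a given availability target, assuming telecom-grade means roughly 99.999% ("five nines"), which is a common convention and not something the answer above specifies.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Allowed downtime per year, in minutes, for an availability target in percent."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return (100 - availability_pct) / 100 * MINUTES_PER_YEAR


five_nines = downtime_budget_minutes(99.999)   # roughly 5.3 minutes per year
three_nines = downtime_budget_minutes(99.9)    # roughly 8.8 hours per year
```

A budget of about five minutes a year leaves no room for manual recovery, which is why automated failover and recovery appear in the answer.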