Terraform interviews test whether you can manage state across 200 resources without causing a production outage — not just write a basic EC2 module.
A Terraform Engineer owns the Infrastructure as Code (IaC) platform and standards. Unlike Cloud Engineers, who use Terraform as one tool among many, Terraform Engineers specialise in HCL, modules, state management, policy as code, and CI/CD pipelines for infrastructure. The role differs from that of DevOps Engineers (who focus on broader deployment and CI/CD) and Platform Engineers (who build self-service platforms on IaC foundations). These 40+ questions test whether candidates can architect, maintain, and secure Terraform at scale.
This guide covers behavioural questions (team collaboration, incident management), core technical areas (HCL and modules, multi-environment deployment, policy and testing), and real-world scenarios. Use these to assess candidates' understanding of state locking, remote backends, workspaces, providers, Terragrunt, Sentinel policy as code, and infrastructure testing.
Local state without locking allows concurrent applies that corrupt state and can cause outages, and the team loses sync between code and reality. How to fix: Always use remote state with locking (S3+DynamoDB, Terraform Cloud, Consul). Treat state as read-only in most cases: change it through Terraform commands rather than by editing it by hand. Document state recovery procedures and test them.
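A minimal sketch of the S3+DynamoDB option, assuming the bucket and lock table already exist. The bucket name, key, region, and table name are hypothetical placeholders.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-tf-state"                   # hypothetical bucket name
    key     = "platform/network/terraform.tfstate" # hypothetical state path
    region  = "eu-west-1"                          # assumed region
    encrypt = true                                 # server-side encryption for the state object

    # DynamoDB table used for state locking; it needs a string
    # partition key named "LockID".
    dynamodb_table = "example-tf-locks"            # hypothetical table name
  }
}
```

With this in place, a second `terraform apply` started while another holds the lock fails fast instead of writing over the same state. Recent Terraform releases also support S3-native lock files through the backend's `use_lockfile` argument, so check the backend documentation for the version in use.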
Relying on workspaces to separate environments makes it easy to apply to the wrong workspace and affect prod. Per-environment changes are hard to code review and version control, which makes the approach a poor fit for teams. How to fix: Use separate Terraform code directories or variable files per environment. Workspaces are useful only for temporary, isolated testing. Enforce environment separation via CI/CD branch protection.
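One way to picture the directory-per-environment approach is a small root module per environment that points at its own state key and passes environment-specific values to a shared module. The layout, module path, and variable names below are assumptions for illustration.

```hcl
# environments/prod/main.tf (hypothetical layout; dev and staging
# would each have a sibling directory with their own values)
terraform {
  backend "s3" {
    bucket = "example-tf-state"           # hypothetical bucket name
    key    = "prod/app/terraform.tfstate" # state key is unique per environment
    region = "eu-west-1"
  }
}

module "app" {
  source = "../../modules/app" # hypothetical shared module

  # Environment-specific inputs; the module is assumed to declare these variables.
  environment   = "prod"
  instance_type = "m5.large"
}
```

Because each environment lives in its own directory, a pull request shows exactly which environment a change touches, and CI can require stricter approval for paths under `environments/prod`.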
Monolithic modules become rigid, hard to reuse, and difficult to test. Small changes force large rebuilds, and applying different settings per environment is difficult. How to fix: Follow single responsibility: one module = one logical component (e.g., VPC module, RDS module, security group module). Compose modules for complex resources. Add variables for common customisations.
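Composition of single-purpose modules might look like the sketch below, where one module's outputs feed the next. The module paths, input variables, and output names are hypothetical and assume each module declares them.

```hcl
# Networking is one logical component.
module "vpc" {
  source     = "./modules/vpc" # hypothetical local module
  cidr_block = "10.0.0.0/16"
}

# The database is another; it consumes outputs from the VPC module
# instead of creating its own network resources.
module "database" {
  source     = "./modules/rds" # hypothetical local module
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids

  # A variable exposed for a common customisation.
  instance_class = "db.t3.medium"
}
```

Each module can then be tested and versioned separately, and a change to the database settings does not force a rebuild of the network.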
When testing and policy checks are skipped, infrastructure breaks in prod due to undetected issues. Policy violations (unencrypted resources, missing tags) slip through, and cost surprises appear after deployment. How to fix: Use Terratest for module verification, tflint for linting, checkov for security scanning, and Infracost for cost estimation. Run all checks in CI before changes are applied. Require peer review of plans.
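As one small piece of such a pipeline, tflint reads its settings from a `.tflint.hcl` file in the repository. The rule selection below is an illustrative assumption, not a recommended baseline.

```hcl
# .tflint.hcl
# Enable the bundled Terraform ruleset with its recommended preset.
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# Opt in to a few extra rules from that ruleset (illustrative choice).
rule "terraform_naming_convention" {
  enabled = true
}

rule "terraform_documented_variables" {
  enabled = true
}
```

A CI job would then run `tflint` in each module directory and fail the build on any finding, alongside the other checks.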
Deep understanding of Terraform state: how it works, why remote state with locking is critical for teams, and how to recover from corruption
Module design and composition: ability to write reusable, testable modules and explain trade-offs between monolithic and modular approaches
Multi-environment strategy: how to safely deploy across dev, staging, prod without duplicating code, using separate state files or workspaces appropriately
CI/CD pipeline design: ability to architect a safe Terraform pipeline with linting, validation, planning, approval gates, and policy enforcement
Provider knowledge: experience with at least 2–3 major providers (AWS, Azure, GCP, Kubernetes, Helm) and ability to chain them in real scenarios
Policy as code (Sentinel, OPA): understanding of how to enforce governance (cost tags, security rules) without blocking legitimate deploys
Cost awareness: knowledge of cost estimation tools (Terraform Cloud, Infracost) and how to prevent surprise expenses
Incident response: ability to handle state corruption, rollbacks, and concurrent apply failures with clear recovery procedures
Security best practices: how to manage secrets, avoid hard-coded credentials, use encrypted backends, and enforce secure defaults
Testing and validation: experience with Terratest, tflint, checkov, and ability to explain how to test infrastructure code effectively
Never commit secrets to version control. Inject them as environment variables from your CI/CD system's secret store (GitHub Secrets, GitLab CI/CD Variables). Store sensitive values in HashiCorp Vault, AWS Secrets Manager, or the Terraform Cloud variable store. Mark variables as sensitive in Terraform so their values are redacted from plan and apply output. The values are still written to state, so encrypt backend state at rest.
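A short sketch of a sensitive variable, assuming the value arrives through the `TF_VAR_db_password` environment variable set by the CI system. The resource arguments are placeholder values.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
  # No default: the value is supplied at run time, e.g. via TF_VAR_db_password.
}

resource "aws_db_instance" "main" {
  identifier          = "example-db" # hypothetical identifier
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = var.db_password # still stored in state, so the backend must be encrypted
  skip_final_snapshot = true            # placeholder choice to keep the example short
}
```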
First, pull a recent backup of the state file (if available from S3 versioning or Terraform Cloud snapshots). If no backup exists, use terraform import to re-import critical resources into a fresh state. Validate each resource was imported correctly before discarding the corrupted state. Consider implementing automated backups and regular state validation checks.
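As an alternative to running `terraform import` by hand for each resource, Terraform 1.5 and later accept `import` blocks in configuration, which lets the re-import be reviewed in a plan first. The bucket name and resource address here are hypothetical.

```hcl
# Declares that an existing bucket should be adopted into state
# at the given resource address on the next apply.
import {
  to = aws_s3_bucket.assets
  id = "example-assets-bucket" # for aws_s3_bucket, the import ID is the bucket name
}

resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket"
}
```

Running `terraform plan` against a fresh state shows the resource as an import, not a create, which is one way to validate each resource before the corrupted state is discarded.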
Terraform Cloud (since renamed HCP Terraform) is HashiCorp's SaaS platform, offering remote state, locking, policy as code (Sentinel), cost estimation, and team management. Terraform Enterprise is the self-hosted version, aimed at organisations that need audit logging, SAML SSO, and on-premises deployment. Choose Cloud for quick setup; Enterprise for compliance-heavy environments requiring full control.
Use Terraform Cloud/Enterprise teams and workspace-level permissions: assign teams to workspaces with 'admin', 'write', or 'read-only' roles. Separate state files per team or project. Enforce SSH keys or OIDC tokens (never hardcoded credentials). Use IAM policies to limit cloud provider access. Audit all Terraform operations via CloudTrail or similar.
Terragrunt is a thin wrapper around Terraform that reduces code duplication in multi-environment setups. It automates terraform init, adds dependency management between modules, and manages remote state configurations. Use it when managing many similar environments with minor customisations. Without Terragrunt, you'd repeat the same backend and variable configuration across directories.
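A rough sketch of how Terragrunt removes that repetition, assuming a `live/` directory with one root file and one child file per component. Paths, bucket names, and the `vpc_id` output are hypothetical.

```hcl
# live/terragrunt.hcl (root): the backend is defined once and
# generated into every child module directory.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "example-tf-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-tf-locks"
  }
}
```

```hcl
# live/prod/app/terragrunt.hcl (child): inherits the backend and
# declares only what differs for this component.
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../../modules/app" # hypothetical module path
}

# Dependency management: read outputs from another component.
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

Newer Terragrunt releases suggest a differently named root file, so the `find_in_parent_folders()` call may need an explicit file name depending on the version.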
Use lifecycle { prevent_destroy = true } on critical resources (databases, load balancers). Require manual approval gates in CI/CD before any destroy operation. Implement separate IAM roles for prod (deny destroy without approval). Log all destroy operations. Consider enabling S3 Object Lock (aws_s3_bucket_object_lock_configuration) on state file buckets to prevent accidental state deletion.
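For illustration, a database guarded at both the Terraform and provider level might look like this. The identifier, sizing values, and the `orders_db_password` variable are hypothetical.

```hcl
variable "orders_db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "orders" {
  identifier        = "orders-prod" # hypothetical identifier
  engine            = "postgres"
  instance_class    = "db.m5.large"
  allocated_storage = 100
  username          = "app"
  password          = var.orders_db_password

  # Provider-level guard: the RDS API rejects deletion while this is true.
  deletion_protection = true

  lifecycle {
    # Terraform-level guard: any plan that would destroy this resource fails.
    prevent_destroy = true
  }
}
```

The two guards cover different paths: `prevent_destroy` stops Terraform from planning a destroy, while `deletion_protection` also blocks deletion from the console or CLI.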
Terraform (AWS provider) provisions EKS clusters; the Helm provider deploys Kubernetes packages (Prometheus, Nginx Ingress) on that cluster. Use both when building complete Kubernetes platforms: Terraform manages infrastructure (nodes, networking); Helm manages application deployments. Separate concerns: Terraform engineers manage the cluster, and platform engineers manage app deployments.
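Chaining the two providers could be sketched as follows, assuming the EKS cluster already exists and its name is passed in as a variable. The nested `kubernetes { ... }` block follows the Helm provider 2.x syntax; later major versions configure it as an attribute instead.

```hcl
variable "cluster_name" {
  type = string # name of an existing EKS cluster (assumed)
}

# Look up connection details for the cluster through the AWS provider.
data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = var.cluster_name
}

# Feed those details into the Helm provider.
provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}

# Deploy a chart onto the cluster.
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  create_namespace = true
}
```

Keeping cluster creation and chart releases in separate root modules (and separate state files) is a common way to honour the split in ownership described above.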
Enable S3 versioning and cross-region replication for state files. Maintain automated backups of state (e.g., daily snapshots). Document and regularly test terraform import procedures to rebuild state if lost. Keep infrastructure code in Git with full history. Run terraform plan regularly (daily) to detect drift between state and real infrastructure. Use terraform plan -refresh-only to review manual changes before reconciling them.
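The versioning part of this can be sketched for a state bucket as below, using AWS provider v4+ resource types. The bucket name is hypothetical, and cross-region replication is left out to keep the example short.

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "example-tf-state" # hypothetical bucket name

  lifecycle {
    prevent_destroy = true # the state bucket itself should never be destroyed by a plan
  }
}

# Keep every previous version of the state object so an earlier
# state can be restored after corruption or deletion.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state at rest.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```

This bucket is usually managed from a small bootstrap configuration with its own state, since the main configurations depend on it existing first.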