Terraform infrastructure grows more complex with every new deployment. Infrastructure code helps you manage that infrastructure uniformly, but as the organization grows, the code itself becomes hard to maintain and scale. In this article, I will show how you can grow and manage a Terraform workflow at scale.
Scalability has several dimensions. You may want to ship code quickly, your priority could be to pivot resources, or you might be looking to tighten security. There are many ways to handle scaling effectively, but we will analyze four readily accessible ones:
- Run Terraform locally
- Integrate into Homegrown Automation
- Open Source Solutions
- Managed Solutions
#1 Run Terraform Locally
You can download the Terraform binary directly to the machine on which you develop your code. Having Terraform on the same machine makes it quick to deploy resources. Direct access to the target provider also makes state operations such as import, move, and remove easier than in a setup without direct access. Features such as remote state and state locking remain available, so you can share state with the team as you scale beyond a single machine. Tooling helps here as well: if you want to keep code quality intact without setting up continuous integration, you can use pre-commit-terraform. Though you can use all of Terraform's features right from your machine, this approach comes with several drawbacks.
Running Terraform locally means a developer performs each action manually, which invites human error. It also wastes time: when many people want to apply changes to the same code at once, one person's changes cannot be applied until the codebase includes the previous person's changes, so a bottleneck forms in testing. Besides the delay, this approach opens up several security weaknesses. Every person on the team needs access to the provider, which could compromise the environment. Because Terraform has to be allowed to create, change, and destroy resources, it is difficult to restrict developers' permissions in this model.
And because of how Terraform state works, every person needs access to the state to work with the codebase. Any team member can therefore run the terraform state pull command and read all the secrets recorded in the state, even if the data originates in Vault. Running Terraform locally can be a good option if you are a one-person team and need to provision resources quickly.
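To make the state exposure concrete, here is a minimal Python sketch that walks a state document and lists attributes whose names look like secrets. It assumes the version 4 JSON state layout (`resources`, then `instances`, then `attributes`). The function name, the `SECRET_HINTS` list, and the sample state are all made up for illustration.

```python
import json
from typing import List, Tuple

# Key fragments treated as "secret-looking"; this list is an illustrative assumption.
SECRET_HINTS = ("password", "secret", "token", "private_key")


def find_secret_like_attributes(state: dict) -> List[Tuple[str, str]]:
    """Return (resource address, attribute name) pairs that look like secrets."""
    findings = []
    for resource in state.get("resources", []):
        address = f'{resource.get("type")}.{resource.get("name")}'
        for instance in resource.get("instances", []):
            for key, value in (instance.get("attributes") or {}).items():
                # Only report attributes that actually carry a value.
                if value and any(hint in key.lower() for hint in SECRET_HINTS):
                    findings.append((address, key))
    return findings


# A trimmed, made-up state document in the version 4 layout.
SAMPLE_STATE = json.loads("""
{
  "version": 4,
  "resources": [
    {
      "mode": "managed",
      "type": "aws_db_instance",
      "name": "main",
      "instances": [
        {"attributes": {"username": "admin", "password": "hunter2", "port": 5432}}
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for address, key in find_secret_like_attributes(SAMPLE_STATE):
        print(f"{address}: '{key}' is readable in the state")
```

In practice the input would come from something like `terraform state pull > state.json`. The point is only that anyone who can pull the state can run a scan like this.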
#2 Integrate into Homegrown Automation
If you already have an in-house CI/CD system running, you can scale by integrating Terraform into it. This way, developers do not need privileged access: they can have read-only permissions while the execution layer holds the access needed to apply changes. In this model, every change is tracked because the CI/CD pipeline logs it, and you can review those records at any time, including in real time.
Other pipeline steps, such as linting, coding standards, compliance checks, and unit tests, can be configured as pull request status checks. Though secure, this approach also limits collaboration. Developers cannot run concurrent pull requests because of state locking: when one pull request triggers the pipeline, a subsequent run fails because the first run still holds the lock on the state for writing. A simple solution is to queue runs, but not every CI/CD product supports queuing.
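The queuing idea can be sketched in a few lines of Python. This is a toy, single-process model, not how any particular CI/CD product implements it; `RunQueue` and its methods are hypothetical names. Runs against one state are executed strictly one at a time in submission order, so a second pull request waits instead of failing on the lock.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class RunQueue:
    """Serialises runs against one state so a later run waits instead of failing."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self.locked_by: Optional[str] = None  # run currently holding the state lock

    def submit(self, run_id: str) -> None:
        """Register a run; it is not executed until drain() reaches it."""
        self._pending.append(run_id)

    def drain(self, execute: Callable[[str], None]) -> List[str]:
        """Execute pending runs one at a time, in submission order."""
        finished = []
        while self._pending:
            run_id = self._pending.popleft()
            self.locked_by = run_id          # lock acquired
            try:
                execute(run_id)
            finally:
                self.locked_by = None        # lock released even if the run fails
            finished.append(run_id)
        return finished


if __name__ == "__main__":
    queue = RunQueue()
    queue.submit("pr-101")
    queue.submit("pr-102")
    queue.drain(lambda run_id: print(f"running {run_id}"))
```

Terraform's own `-lock-timeout` option makes a command retry for a bounded time when the state is locked. That can soften the failure, but it does not order the waiting runs the way a queue does.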
Moreover, building and managing an entire pipeline takes more resources. Several processes are essential to an efficient workflow:
- Planning on pull requests
- Unit tests
- Compliance checks
- Applying once merged
- Periodic drift detection
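As a rough sketch of how the steps in this list map onto Terraform commands, the Python table below assigns a command sequence to each pipeline trigger. The event names (`pull_request`, `merge`, `schedule`) and the `commands_for` helper are hypothetical. Unit tests and compliance checks are left as comments, because the tool you use for them determines the command.

```python
from typing import Dict, List

# Event names are hypothetical; map them to whatever your CI system emits.
PIPELINE: Dict[str, List[List[str]]] = {
    # Planning on pull requests (unit tests and compliance checks would run here too).
    "pull_request": [
        ["terraform", "fmt", "-check", "-recursive"],
        ["terraform", "init", "-input=false"],
        ["terraform", "validate"],
        ["terraform", "plan", "-input=false", "-out=tfplan"],
    ],
    # Applying once merged: plan again on the merged code, then apply that exact plan.
    "merge": [
        ["terraform", "init", "-input=false"],
        ["terraform", "plan", "-input=false", "-out=tfplan"],
        ["terraform", "apply", "-input=false", "tfplan"],
    ],
    # Periodic drift detection: a plan whose exit code reports whether a diff exists.
    "schedule": [
        ["terraform", "init", "-input=false"],
        ["terraform", "plan", "-input=false", "-detailed-exitcode"],
    ],
}


def commands_for(event: str) -> List[List[str]]:
    """Return a copy of the command sequence for a pipeline event."""
    if event not in PIPELINE:
        raise ValueError(f"no pipeline defined for event {event!r}")
    return [list(command) for command in PIPELINE[event]]


if __name__ == "__main__":
    for command in commands_for("pull_request"):
        print(" ".join(command))
```

Keeping the mapping as data rather than scattered shell scripts is one possible design choice. It makes the workflow easy to review, but each entry is still something the team has to own.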
It would be an additional load on the team to maintain all of this in-house, with all its configuration. A homegrown automation solution looks tempting, but you will soon face various problems.
Nevertheless, it is a good option if you are highly concerned about security. Everything stays in your control, and you can employ multiple security arrangements.
#3 Open Source Solutions
One reason for Terraform's popularity is its wide open-source ecosystem. While scaling, you can use these open-source tools to add features to your Terraform setup.
Several popular tools are:
- Terraformer: A CLI tool to create Terraform files from existing infrastructure. Let's say your infrastructure is working well and you want to capture its current state as Terraform files. Terraformer is the tool for you.
- Terratest: A Go library for writing automated tests for infrastructure code. It comes with helper functions and patterns for everyday infrastructure testing tasks, so you can use it with Terraform to write tests quickly.
- Terragrunt: Terragrunt is a thin wrapper for Terraform that provides extra tools for keeping your Terraform configurations DRY, working with multiple Terraform modules, and managing remote state.
- Atlantis: Atlantis enables you to automate your Terraform via pull requests.
- Terrascan: Scans your IaC for compliance and security violations before you provision cloud-native infrastructure. It can run locally or integrate with your CI/CD.
- Driftctl: As the code becomes complex, tracking any change in infrastructure state becomes critical, and that's where Driftctl comes in. It detects, tracks, and alerts on infrastructure drift.
- Terraform Switcher: A command-line tool to quickly download and switch between Terraform versions.
- Infracost: Cloud costs can get out of hand quickly if they are not managed appropriately. Infracost is a handy tool for DevOps and SRE teams to see cloud cost estimates and compare different options upfront.
There are more than 5,000 open source Terraform projects covering a variety of requirements: linting, environment management, security tools, test automation, cost management, and more. If you go completely open source, you get community support, forums, and, most importantly, tools to help you scale your infrastructure efficiently. However, you may need quick support during an emergency or when errors occur, which open source may not provide. That's where managed solution providers come in, with their fantastic support and intuitive platforms.
#4 Managed Solutions
A managed solution usually comes with a specialized management platform whose tools fill the gaps in collaboration, security, speed, and reliability. With the stability and powerful features that IaC platforms provide, handling the nuances of Terraform infrastructure becomes relatively easy.
The two most popular platforms are:
- Terraform Cloud
Terraform Cloud is provided by HashiCorp - the company behind Terraform itself. With the freemium pricing approach, you can choose the plan your organization needs.
First of all, Terraform Cloud solves the state file issue. You can still run Terraform on your local machine, but the state file is saved to and retrieved from Terraform Cloud. In other words, Terraform Cloud manages the state file for you.
As a result, state files are more secure and the team can collaborate much better. In addition, you can grant access to selected users and have granular control over who accesses the state file.
It also keeps every version of the state file and tracks all the changes applied.
Keeping state files safe and secure is critical to scaling Terraform infrastructure efficiently, as the entire team relies on them to manage the workflow. As the team and workload grow, collaboration becomes critical for the sustainable development of any infrastructure.
With a cloud SaaS solution like Terraform Cloud, with its centralized resources, help, history of changes, controlled access, and support, the team can focus on delivering code.
- Spacelift
Spacelift is a highly flexible platform that integrates with Terraform and other Infrastructure as Code tools. It uses open source technologies to enable flexibility and customization in infrastructure management.
For example, it uses Open Policy Agent, an open-source policy engine that other products often used alongside Terraform (Kubernetes, Kafka, etc.) also incorporate. Because the tool is the same across products, you can focus on enforcing compliance instead of learning a new syntax.
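To give a feel for the kind of rule such a policy expresses, here is a small Python sketch. It is not Rego, the language Open Policy Agent policies are written in, and it is not Spacelift's policy API. It only mirrors the logic of a "deny deletions of protected resource types" rule against the JSON form of a plan (as produced by `terraform show -json`), reading the `resource_changes` entries and their `change.actions`. The function name, the protected types, and the sample plan are assumptions for illustration.

```python
from typing import Any, Dict, List


def deny_deletions(plan: Dict[str, Any], protected_types: List[str]) -> List[str]:
    """Return one message per planned deletion of a protected resource type."""
    violations = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        # A replacement shows up as both "delete" and "create", so it is caught too.
        if "delete" in actions and change.get("type") in protected_types:
            violations.append(
                f'{change["address"]}: deleting {change["type"]} is not allowed'
            )
    return violations


# A trimmed, made-up plan document.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
         "change": {"actions": ["delete"]}},
        {"address": "aws_instance.web", "type": "aws_instance",
         "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.main", "type": "aws_db_instance",
         "change": {"actions": ["delete", "create"]}},
    ]
}

if __name__ == "__main__":
    for message in deny_deletions(SAMPLE_PLAN, ["aws_s3_bucket", "aws_db_instance"]):
        print(message)
```

A real policy engine adds what this sketch lacks: a declarative language, a shared library of rules, and a defined place in the workflow where a violation blocks the run.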
One fantastic feature that Spacelift has and Terraform Cloud lacks is scheduled, periodic drift detection on any stack. As the infrastructure grows, detecting whether a stack's actual configuration differs, or has drifted, from its expected configuration becomes necessary to catch deviations. Of course, you can use Driftctl to track drift, but in Spacelift the capability is built in. Also, because Terraform Cloud manages the state file, you first need access to the provider to import something directly from the state file. With Spacelift, you do not need additional permissions to run any command.
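For readers who want to see what a basic drift check boils down to, here is a hedged Python sketch. It does not show how Spacelift or Driftctl implement drift detection. It relies only on Terraform's documented `-detailed-exitcode` behaviour for `terraform plan`: exit code 0 means an empty diff, 1 means an error, and 2 means the plan succeeded and found differences. `DriftStatus`, `classify_plan_exit_code`, and `check_drift` are hypothetical names.

```python
import subprocess
from enum import Enum


class DriftStatus(Enum):
    IN_SYNC = "in sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_plan_exit_code(code: int) -> DriftStatus:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return DriftStatus.IN_SYNC   # plan succeeded with an empty diff
    if code == 2:
        return DriftStatus.DRIFTED   # plan succeeded and found differences
    return DriftStatus.ERROR         # 1 (or anything unexpected): the plan failed


def check_drift(workdir: str) -> DriftStatus:
    """Run a plan in `workdir`; needs Terraform installed and the directory initialised."""
    result = subprocess.run(
        ["terraform", "plan", "-input=false", "-detailed-exitcode"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(result.returncode)
```

A non-empty diff only indicates drift when the applied code and the checked-out code are the same. On a branch with unapplied changes, the same exit code simply reflects those pending changes.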
Spacelift is similar to Terraform Cloud but has slightly more features and tools to manage Terraform efficiently.
Which one to pick?
You can efficiently scale Terraform with:
- Running Terraform locally: A fit if your product demands quick changes and you are a one-person or small team. As the team grows, collaboration becomes difficult.
- Homegrown Automation: Highly effective if you want total control over Terraform and your infrastructure. But setup and maintenance require resources.
- Open Source solutions: Bring the flexibility and configurability of open source tools to Terraform. However, you have to manage updates, upgrades, security, and errors yourself.
- Managed Solutions: For organizations that want a reliable, plug-and-play Terraform solution with support.
Just as Terraform makes it easier to manage infrastructure, there are methods to manage growing Terraform requirements effectively.
In this article, I suggested four methods. However, which one you choose depends on your organization's requirements. So, analyze your technical requirements and the focus of your scaling effort, and pick the suitable method. If you have any questions or doubts, please leave them in the comment section.
DevOps Community Manager at Spacelift
Jacob is a DevOps Engineer based in Berlin, currently working as DevOps Community Manager at Spacelift. He has worked with cloud and DevOps technologies for the last four years. He is passionate about DevOps, the cloud industry, and community building. In his free time he enjoys hiking, cycling, and biography books.