Managing drift is one of the biggest operational challenges for teams deploying to the cloud. It remains difficult even with sophisticated deployment tools like Terraform. In this article, I'll look at why drift occurs and at some of the most popular tools for managing it.
Drift detection in Terraform
At TinyStacks, we're big on Infrastructure as Code (IaC). IaC lets you launch your application in a rapid and repeatable manner. That means not only rapid production deployments, but also a multi-stack, multi-environment deployment process that results in higher-quality deployments with less (ideally, zero) downtime.
Drift occurs in any IaC framework when your system as represented in code differs from an environment's actual runtime state. This occurs when your team makes changes directly or indirectly to an environment without capturing them first in code.
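To make that definition concrete, here's a minimal sketch in Python. The resource names and attributes are hypothetical, and the two dictionaries simply stand in for what a tool would read from your code and from your cloud provider. It isn't how Terraform works internally.

```python
def find_drift(declared, actual):
    """Compare declared (in code) and actual (running) resources by name."""
    drift = {}
    for name in sorted(set(declared) | set(actual)):
        if name not in actual:
            # Defined in code, but deleted or never created in the environment.
            drift[name] = "missing from environment"
        elif name not in declared:
            # Created by hand in the environment and never captured in code.
            drift[name] = "not captured in code"
        elif declared[name] != actual[name]:
            changed = sorted(
                key
                for key in set(declared[name]) | set(actual[name])
                if declared[name].get(key) != actual[name].get(key)
            )
            drift[name] = "changed: " + ", ".join(changed)
    return drift


# Hypothetical example data.
declared = {
    "web_server": {"instance_type": "t3.small", "port": 443},
    "db": {"engine": "postgres", "storage_gb": 100},
}
actual = {
    "web_server": {"instance_type": "t3.large", "port": 443},
    "db": {"engine": "postgres", "storage_gb": 100},
    "debug_bucket": {"public": True},
}

drift_report = find_drift(declared, actual)
print(drift_report)
```

Here someone resized the web server by hand and created a bucket that was never added to code. Both count as drift.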
Terraform is a great tool for creating and managing IaC deployments. (Many teams find it's even better when you add in Terragrunt, which adds multiple features for reducing code duplication and managing state.)
But even Terraform isn't impervious to the vexing issues caused by drift. That's why Terraform builds some basic drift detection into its toolset.
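One common way to use that built-in detection is to run `terraform plan` with the `-detailed-exitcode` flag, which exits with 0 for an empty diff, 1 for an error, and 2 when changes are present. The sketch below wraps that call in Python. It assumes `terraform` is on your PATH and that the working directory is already initialized. The function names are my own, and adding `-refresh-only` (to limit the plan to differences between state and real infrastructure) is one reasonable choice rather than the only one.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
EXIT_MEANINGS = {0: "in sync", 1: "error", 2: "drift detected"}


def interpret_exit_code(code):
    """Translate a plan exit code into a human-readable status."""
    return EXIT_MEANINGS.get(code, "unknown")


def check_drift(working_dir):
    """Run a refresh-only plan in working_dir and return (status, plan output)."""
    result = subprocess.run(
        [
            "terraform", "plan",
            "-refresh-only",       # compare state with real infrastructure only
            "-detailed-exitcode",  # make the exit code reflect the diff
            "-input=false",        # never prompt; suitable for automation
            "-no-color",
        ],
        cwd=working_dir,
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(result.returncode), result.stdout
```

Calling `check_drift("path/to/stack")` (a placeholder path) returns a status string you can log or act on, along with the plan text for review.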
Why drift occurs in Terraform
By far the biggest culprit of drift is overprivileged access to cloud platforms. If team members retain access to your production system (either via the console or command-line tools), many will end up using it because "it's easier".
The best way to solve this is to put processes and deployment systems in place to ensure that no one needs to touch production except in extremely rare, emergency circumstances. Nearly all changes should go through a formal promotion process to production. This ensures that all changes are captured in code first. It also ensures changes are vetted in pre-production environments before going live.
Drift can occur for other reasons, though. Code that makes changes to infrastructure components can cause a deviation between what's written and what's running. Or a team member might forget to push a change all the way through to production.
Additionally, teams just starting on their DevOps journey may be in a "hybrid mode" as they transition from manually deploying a monolithic app to automatically deploying microservices and modules. It's inevitable during such times that Site Reliability Engineers, DevOps Engineers and others will make changes directly to a production environment.
Advantages of using Terraform tools for managing drift
Drift detection isn't something you can do just by glancing over your Terraform files and comparing them to your production deployment. That won't work for anything except the most rudimentary app. Your team needs a tool it can use to automate detection and alerting.
Drift detection tools also enable continuous monitoring. Once it's configured, your drift detector can run either as part of your CI/CD pipeline or on a set schedule. Such continuous monitoring helps you detect severe drift issues before they lead to downtime or breaches.
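As a rough sketch of what a scheduled or pipeline check might do with its results, the Python below summarizes drift from Terraform's machine-readable plan output (produced with `terraform show -json` on a saved plan). It assumes the JSON contains a `resource_drift` list whose entries carry an `address` and a `change.actions` list, which is my understanding of recent Terraform versions. The function names, the ignore list, and the sample data are hypothetical.

```python
import json


def summarize_drift(plan_json):
    """Return (address, actions) pairs for every drifted resource in a JSON plan."""
    plan = json.loads(plan_json)
    summary = []
    for entry in plan.get("resource_drift", []):
        actions = entry.get("change", {}).get("actions", [])
        summary.append((entry.get("address", "<unknown>"), "/".join(actions)))
    return summary


def should_alert(summary, ignored_prefixes=()):
    """Alert if any drifted resource is not covered by an ignored address prefix."""
    ignored = tuple(ignored_prefixes)
    return any(not address.startswith(ignored) for address, _ in summary)


# Hypothetical sample resembling a JSON plan with one drifted resource.
sample_plan = json.dumps(
    {
        "format_version": "1.1",
        "resource_drift": [
            {"address": "aws_security_group.web", "change": {"actions": ["update"]}}
        ],
    }
)

summary = summarize_drift(sample_plan)
print(summary, should_alert(summary))
```

A CI job or scheduled task could fail the build or page someone whenever `should_alert` returns True.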
What to look for in Terraform tools
When evaluating a tool to assist with drift in Terraform, there are a few capabilities to keep in mind.
Cloud providers supported
A major feature of Terraform is that it lets you use a single language and toolset to deploy to multiple cloud providers (AWS, Google Cloud Platform, Microsoft Azure, etc.). A tool for Terraform drift detection should be able to handle the same set of cloud providers that your team supports.
Cloud access required
The level of cloud account access required for Terraform drift detection is also a major consideration. You don't want to require that a team member have full admin rights to your various cloud accounts to run drift detection. More than likely, that will just make your drift management problems worse.
The best practice here is to use a tool that only requires read-only permissions to the specific cloud services used in your stack. Sadly, the default Terraform drift detection capability requires something closer to full administrator rights.
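If you want to sanity-check the permissions a drift detection tool asks for, a small script can help. The sketch below assumes AWS-style action names (`service:Operation`) and treats only operations beginning with `Get`, `List`, or `Describe` as read-only. That prefix rule is a simplifying assumption of mine, not an official classification: some read operations use other verbs, and some `Get` operations return sensitive data.

```python
# Assumed heuristic: these operation prefixes indicate read-only access.
READ_ONLY_VERBS = ("Get", "List", "Describe")


def non_read_only_actions(actions):
    """Return the actions that are not clearly read-only under the heuristic."""
    flagged = []
    for action in actions:
        _service, _, operation = action.partition(":")
        # Wildcards such as "*" or "ec2:*" grant far more than read access.
        if operation == "*" or not operation.startswith(READ_ONLY_VERBS):
            flagged.append(action)
    return flagged


# Hypothetical permission set requested by a drift detection tool.
requested = [
    "ec2:DescribeInstances",
    "s3:ListAllMyBuckets",
    "s3:GetBucketPolicy",
    "iam:CreateUser",
    "ec2:*",
]

flagged = non_read_only_actions(requested)
print(flagged)
```

Anything the function flags deserves a closer look before you grant it.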
IaC systems supported
It's very likely your company doesn't just use one single IaC framework like Terraform. If you work for a mid- to large-sized company, or are an ISV managing multiple customer deployments, it's likely that you're using systems like CloudFormation or Serverless Framework for some of your deployments.
Terraform drift detection is good...for your Terraform deployments. But what do you do about your other deployments? You should verify that the drift detection tool you choose supports all of your target IaC frameworks.
Vendor support
Another issue is the level of support you'll receive from a vendor for their Terraform drift detection toolset. This can be challenging, as drift detection isn't easy. In fact, it's so difficult that at least one vendor - CloudQuery - decided to throw in the towel on its Terraform drift detection tool.
Before you commit to a tool, see how long it's been around. Ask the maintainers what priority they give to the feature and what future plans they have for improving it. If their answers are vague or evasive, you may want to shop around some more.
Tools for managing Terraform drift
As I mentioned, Terraform contains its own built-in support for detecting drift. However, several other tools exist to take Terraform drift detection and management to the next level. Let's look at a few below.
TinyStacks
TinyStacks (that's us!) enables customer success teams to deploy infrastructure to any customer's cloud account and manage all of their deployments through a single pane of glass. With TinyStacks, you can drive all of your infrastructure deployments through IaC and let customers apply stack updates with a single button click. You can eliminate the issue of drift entirely by managing all deployments automatically.
TinyStacks is largely tools-agnostic and can support Terraform, CloudFormation, API-driven deployments, or any other IaC framework that your team currently supports. Contact us today and ask us how we can help you manage your customer cloud deployments more easily.
BridgeCrew
BridgeCrew has a drift detection feature that supports all three major clouds.
BridgeCrew built its drift detection feature to address what it saw as the shortcomings of Terraform drift detection. BridgeCrew's solution doesn't require over-privileged access, and it doesn't depend on reading Terraform state files that might contain secrets (such as passwords and access keys).
You can implement BridgeCrew drift detection by integrating the company's Yor command line tool into your Continuous Integration (CI) pipeline. Yor implements a consistent set of tags on your cloud infrastructure. It then uses this information to aid in drift detection. Yor works across resources deployed by Terraform, CloudFormation, and the Serverless framework.
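To illustrate why consistent tags help, here's a minimal Python sketch of tag-based tracing: each resource defined in code carries a unique trace tag, so a resource found in the cloud can be mapped back to the file that defines it. I've used the tag name `yor_trace`, which is my understanding of the tag Yor adds. The data structures, file paths, and function names are hypothetical, and this isn't a description of how BridgeCrew processes the tags.

```python
def index_by_trace(iac_resources):
    """Build a lookup from trace tag value to the IaC resource that carries it."""
    return {
        resource["tags"]["yor_trace"]: resource
        for resource in iac_resources
        if "yor_trace" in resource.get("tags", {})
    }


def trace_to_code(cloud_resource, index):
    """Return the IaC definition for a live resource, or None if it is untracked."""
    trace = cloud_resource.get("tags", {}).get("yor_trace")
    return index.get(trace)


# Hypothetical resources as defined in code.
iac_resources = [
    {"address": "aws_s3_bucket.logs", "file": "storage.tf",
     "tags": {"yor_trace": "a1b2"}},
    {"address": "aws_instance.web", "file": "compute.tf",
     "tags": {"yor_trace": "c3d4"}},
]
index = index_by_trace(iac_resources)

# Hypothetical resources as found in the cloud account.
tracked = {"id": "i-0123", "tags": {"yor_trace": "c3d4"}}
untracked = {"id": "i-9999", "tags": {}}

match = trace_to_code(tracked, index)
print(match["file"] if match else None, trace_to_code(untracked, index))
```

A live resource with no matching trace, like `untracked` above, is a strong hint that it was created outside of your IaC pipeline.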
Once you've implemented Yor, BridgeCrew can give you a visual synopsis of any changes it's detected. It lets you review each change, decide whether it needs fixing, and even generate a pull request into your Git repo so you can ship a fix.
Snyk / driftctl
Like BridgeCrew, Snyk has integrated their own drift detection into their IaC offering, Snyk IaC.
Snyk's primary goal is improving the security of your DevOps CI/CD pipelines, so it makes sense that they'd tackle drift detection as well. As I discussed above, a case of drift can quickly become a costly and devastating security issue.
Setup for Snyk is easy - you just have to select an existing project or Git repo that uses either Terraform or Kubernetes. Snyk will then detect drift using its open-source driftctl tool.
If you don't want to subscribe to Snyk, you're also free to use the driftctl tool directly. The Git repo for driftctl points to documentation and tutorial videos to help you get started. The advantage of using Snyk IaC is that it runs both driftctl and a battery of security checks against your IaC pipeline. With DevOps pipelines increasingly targeted as an attack vector by malicious actors, such checks can pay dividends in the long run.
The TinyStacks difference
Managing drift is hard - whether it's in Terraform or any other IaC platform. It's even harder when you're managing drift for dozens of clients spread across multiple on-prem and cloud deployments.
We've built TinyStacks to help reduce drift across deployments and give your customers click-button access to application stack updates. Contact us today to get the ball rolling.