Terraform Is NOT the Cloud Agnostic Platform You Think It Is!

I’ve been working with the Cloud, and more specifically, Azure, for many years. And things undoubtedly have progressed.

We have gone from modifying XML files for networking (yes, that was a thing), to ARM and PowerShell, to today’s cross-platform command-line interface (CLI) and Bash. The way we interact with and deploy to Azure continues to advance.

And in the ever-growing world of DevOps and Infrastructure-as-Code (IaC), there is a seemingly endless list of tools that can be used to deploy to the cloud.

Although I am primarily focused on Azure, and use ARM templates to fulfill the Infrastructure-as-Code (IaC) need there, I have lately been doing some work with Terraform.

Recently, I was working with a Client whose infrastructure runs entirely in Azure, yet they chose to use Terraform in order to be “cloud agnostic.” OK, fair enough. I can understand that they want options.

But I’m going to describe why I prefer to use the cloud-native template language available, for the best (and most up-to-date) experience possible, and why I feel that an Infrastructure-as-Code (IaC) platform/language like Terraform, although touted as being “cloud agnostic,” really isn’t. Let’s dig in.

Providers

First, let’s start with Providers. In Terraform, the Provider reference basically tells Terraform what Application Programming Interface (API) it will call to deploy your template/code.

It looks something like this:

provider "azurerm" { }
provider "aws" {}

That’s fine, I’ll give them that one. After all, you need to point it at at least one cloud to deploy to (along with Subscription information, etc.).
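In practice, the provider blocks carry a bit more than the empty braces above. As a rough sketch (assuming Terraform 0.13 or later and a 2.x or later azurerm provider; the version constraints, subscription ID, and region are placeholders chosen for illustration, not values from a real deployment):

```hcl
# Provider plugins are versioned separately from Terraform itself,
# so it is common to pin them. The constraints below are assumptions.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "azurerm" {
  # The features block must be present in azurerm 2.x and later, even if empty.
  features {}

  # Placeholder subscription ID.
  subscription_id = "00000000-0000-0000-0000-000000000000"
}

provider "aws" {
  # Placeholder region.
  region = "ca-central-1"
}
```

Note that even at this level the two blocks share nothing beyond the `provider` keyword: each takes its own cloud-specific arguments.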

However, the problem is that every time Azure or AWS releases a new feature (and new APIs), someone must update the Provider, whether that’s HashiCorp or the open source community. This leads to a constant “chasing the tail” issue, which I expand on later in the ‘Feature Releases’ section.

Resource Deployment

Obviously, in Infrastructure-as-Code (IaC), the whole premise is deploying your infrastructure through the use of code (versus through a GUI, etc.).

In Azure, you have Azure Resource Manager (ARM). In Amazon Web Services (AWS) you have CloudFormation (CFN). Both of these leverage JavaScript Object Notation (JSON), although CloudFormation also accepts YAML.

In ARM, the template structure and syntax for describing a resource through code looks something like this:

[Image: ARM – Structure Syntax]

But in Terraform, when you want to deploy an Azure resource, it’s much cleaner/simpler, like so:

[Image: Terraform – Structure Syntax Example]
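As a rough sketch of that syntax (assuming Terraform 0.12 or later and the azurerm provider; the resource group and its name are placeholders added for illustration), a Storage Account definition might look something like this:

```hcl
# Placeholder resource group to hold the Storage Account.
resource "azurerm_resource_group" "rg" {
  name     = "rg-storage-demo"
  location = "Canada Central"
}

# The Storage Account itself, using azurerm-specific argument names.
resource "azurerm_storage_account" "sa" {
  name                     = "adinermie1984"
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```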

But to illustrate my point, here’s the equivalent resource (i.e. object storage, Azure’s Storage Account), but deployed to Amazon Web Services (AWS) via a Terraform template:

[Image: Terraform – AWS – Storage Template – Example]
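A minimal sketch of the AWS side, assuming the aws provider (the bucket name and tag are placeholders for illustration), could be as short as:

```hcl
# S3 is the closest AWS counterpart to an Azure Storage Account.
# There is no resource group, tier, or replication argument here;
# the region comes from the provider configuration instead.
resource "aws_s3_bucket" "sa" {
  # Placeholder name; S3 bucket names must be globally unique.
  bucket = "adinermie1984"

  tags = {
    Environment = "demo"
  }
}
```

The resource type and the argument names differ from the Azure version, so none of the Azure definition carries over.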

Notice anything? Both code blocks start with:

resource {
}

But the rest of the code is completely different. Case in point, the “azurerm_ …” and “aws_ …” parts.

resource "azurerm_storage_account"
resource "aws_s3_bucket" 

So, even if you are actually deploying resources to multiple cloud providers, you still have to write your Terraform templates using each cloud’s specific Provider and resource references. Therefore, you are going to end up with two sets of “agnostic” templates, one for Azure and another for AWS.

If Terraform were truly a “cloud agnostic” platform and language, then I would expect the template to simply be something like this (notice the absence of either ‘azurerm’ or ‘aws’ specific references and terminology):

resource "storage_account" "sa" {
  name                = "adinermie1984"
  resource_group_name = "${resource_group.<>.name}"
  location            = "CanadaCentral"
  storage_class       = "Standard"
  replication_type    = "LRS"
}

Feature Releases

Now on to our next point.

When a Cloud Provider releases a new feature, even in Private or Public Preview, they generally make available options for automated/scripted deployments (like in ARM or CFN).

This is through an update in the API, which normally comes in the form of a new API version (which may or may not include the “-preview” label). For example, here is the list of API versions available for the Microsoft.Compute Resource Provider, specifically for the Virtual Machine Resource Type. Notice how some of the APIs have “-preview” in them.

[Image: ARM – Compute – VM – API Versions]

So, what does this mean? It means, when Microsoft, or Amazon, or whoever releases a new feature, you can try it immediately (via a new/preview/updated API).

In Terraform, you must wait for a Provider to be updated in order to use the new feature. But if you absolutely cannot wait, you can still use Terraform to try out the new feature… you just have to deploy an ARM template embedded within your Terraform template!

[Image: Terraform – Embedded ARM Template]
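To give a sense of what that workaround looks like, here is a hedged sketch using the `azurerm_resource_group_template_deployment` resource from more recent azurerm provider versions (older versions used `azurerm_template_deployment` for the same purpose). The resource names and the `apiVersion` value are assumptions for illustration, and the vault is left without any child resources:

```hcl
# Placeholder resource group for the deployment.
resource "azurerm_resource_group" "rg" {
  name     = "rg-backup-demo"
  location = "Canada Central"
}

# Terraform only submits the deployment; the resource inside it is
# described in ARM's own JSON schema, built here with jsonencode().
resource "azurerm_resource_group_template_deployment" "rsv" {
  name                = "rsv-deployment"
  resource_group_name = azurerm_resource_group.rg.name
  deployment_mode     = "Incremental"

  template_content = jsonencode({
    "$schema"      = "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#"
    contentVersion = "1.0.0.0"
    resources = [
      {
        type       = "Microsoft.RecoveryServices/vaults"
        apiVersion = "2016-06-01" # assumed API version
        name       = "rsv-demo"   # placeholder vault name
        location   = "[resourceGroup().location]"
        sku = {
          name = "RS0"
          tier = "Standard"
        }
        properties = {}
      }
    ]
  })
}
```

Anything Terraform cannot yet express natively would be added as further entries in the `resources` list, which means you end up maintaining ARM syntax inside your Terraform code anyway.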

In addition, even when a Provider is written or updated, that does not mean it’s complete. For example, at the time of writing, you can use a native Terraform template to deploy a Recovery Services Vault (RSV), but you cannot deploy an Azure Backup Policy into that RSV! This is a case where you need to use the embedded ARM template.

I encountered this issue while working with this Client, and it was ironic to find out that I had to write an ARM template to deploy an RSV (complete with a custom Backup Policy), but through/within a Terraform template. When I asked the Client why this was (as they were more experienced/versed in Terraform than I was), their answer was “because no one has written the Provider for that yet.”

Now, I’m not knocking HashiCorp, as their developers are very busy (as are all the other community contributors). But I would expect that if you’re writing the Provider for a specific piece of technology (like the RSV), you would first write the primary module (to deploy the RSV), and then move on to the other inter-related sub-components immediately after.

Conclusion

These are just some of the things I’ve noticed as I’ve started to work with Terraform for deploying Infrastructure-as-Code (IaC). I’m not saying we shouldn’t use it; its syntax is very clean and easy to read. And I do really like the terraform destroy command for easily cleaning up my test deployments.

All I’m saying is, people should not state that they want to use a “cloud agnostic” templating language when in reality, it isn’t.

For me personally, if I’m working with Azure, then I choose to just use the cloud-native template language available, for the best (and most up-to-date) experience possible.
