Lessons Learned Testing the ‘Configure Backup on VMs’ Azure Policy

Recently, I was doing some testing with the ‘Configure backup on VMs of a location to an existing central Vault in the same location’ Azure Policy and wanted to share some interesting lessons learned.

Backup Related Azure Policies

First, at the time of writing there are only two backup-related built-in policies available:

  • Azure Backup should be enabled for Virtual Machines
  • Configure backup on VMs of a location to an existing central Vault in the same location

The first policy (‘Azure Backup should be enabled for Virtual Machines’) is an ‘AuditIfNotExists’ policy, meaning it only reports non-compliance and won’t actually change anything. The second policy (‘Configure backup on VMs of a location to an existing central Vault in the same location’) offers ‘AuditIfNotExists’ too, but also includes a ‘DeployIfNotExists’ effect. This means it can actually be used to change the target resource.

I’ve already written another blog post on how to use the ‘Azure Backup should be enabled for Virtual Machines’ policy in a more customized and targeted way. You can check out that blog post here: Auditing for Backup with Azure Policy.

The description of the ‘Configure backup on VMs’ policy (as I’ll call it for short) states:

This policy configures Azure Backup protection on VMs in a given location to an existing central vault in the same location. It applies to only those VMs that are not already configured for backup. It is recommended that this policy is assigned to not more than 200 VMs. If the policy is assigned for more than 200 VMs, it can result in the backup getting triggered a few hours beyond the defined schedule. This policy will be enhanced to support more VM images.

Policy Assignment

If you assign this policy, you will notice that, in the Parameters, we specify the location, the Recovery Services Vault (RSV), and the RSV Backup Policy.

Also, on the Remediation tab, I include the creation of a ‘remediation task’. Notice that, because this policy uses ‘DeployIfNotExists’, its remediation includes a Managed Identity (this will be important later). You want to ensure that the location you select for the Managed Identity matches the location you set in the Parameters (i.e. you don’t want your Managed Identity to be in Europe while your resources are in Canada). Also, notice the permissions that are given to the Managed Identity. In this policy’s case, they are the Virtual Machine Contributor and Backup Contributor roles.

Testing Results (and Discovery)

In my testing, I was setting the Scope of the policy assignment to a specific Resource Group versus my entire Subscription, since, well, I was testing it. This will be an important point to remember later on.

When you create a Policy Assignment, it will produce an Assignment ID like this: /subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/{ResourceGroup}/providers/Microsoft.Authorization/policyAssignments/{PolicyAssignmentID}
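Since the scope is encoded in the Assignment ID itself, it can be handy to pull the ID apart when troubleshooting. Here’s a minimal Python sketch of that. The class and function names are my own (hypothetical), and it only assumes the Subscription and Resource Group formats shown above; management group scopes are not handled.

```python
import re
from typing import NamedTuple, Optional


class AssignmentId(NamedTuple):
    """Pieces of a policy assignment ID (hypothetical helper type)."""
    subscription_id: str
    resource_group: Optional[str]  # None when assigned at Subscription scope
    name: str


# Resource IDs are treated as case-insensitive here.
_ASSIGNMENT_ID = re.compile(
    r"^/subscriptions/(?P<sub>[^/]+)"
    r"(?:/resourceGroups/(?P<rg>[^/]+))?"
    r"/providers/Microsoft\.Authorization/policyAssignments/(?P<name>[^/]+)$",
    re.IGNORECASE,
)


def parse_assignment_id(assignment_id: str) -> AssignmentId:
    """Split an Assignment ID into subscription, Resource Group, and name."""
    match = _ASSIGNMENT_ID.match(assignment_id)
    if match is None:
        raise ValueError(f"not a policy assignment ID: {assignment_id!r}")
    return AssignmentId(match["sub"], match["rg"], match["name"])
```

If the ‘resource_group’ field comes back populated, the assignment (and therefore its Managed Identity) is scoped to that single Resource Group, which matters for what follows.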

After the Policy was assigned, I patiently waited for it to be evaluated, and checked the status of the Remediation Task. As you can see in the screenshot, the remediation task failed! So, of course, we have to dig into it to find out why.

To get to the remediation task error, you need to navigate to Assignment Details > Remediation Tasks and click on the failed task itself. From there, you have to click on ‘Related events’ in order to see the deployment error.

Unfortunately, the Deployment Error side-blade isn’t quite user-friendly (as in, you’ll have to scroll horizontally to read the whole error message).

If you look at the detailed error message, it states:

Deployment failed with multiple errors: ‘Authorization failed for template resource ‘DeployProtection-7gzwrcxarsfmm’ of type ‘Microsoft.Resources/deployments’. The client ‘671dc22a-af16-4b16-9bda-51b2bfc04bef’ with object id ‘671dc22a-af16-4b16-9bda-51b2bfc04bef’ does not have permission to perform action ‘Microsoft.Resources/deployments/write’ at scope ‘/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/eastusvault/providers/Microsoft.Resources/deployments/DeployProtection-7gzwrcxarsfmm’.:Authorization failed for template resource ‘rsv/Azure/iaasvmcontainer;iaasvmcontainerv2;AzureLab;backuptestvm01/vm;iaasvmcontainerv2;AzureLab;backuptestvm01’ of type ‘Microsoft.RecoveryServices/vaults/backupFabrics/protectionContainers/protectedItems’. The client ‘671dc22a-af16-4b16-9bda-51b2bfc04bef’ with object id ‘671dc22a-af16-4b16-9bda-51b2bfc04bef’ does not have permission to perform action ‘Microsoft.RecoveryServices/vaults/backupFabrics/protectionContainers/protectedItems/write’ at scope ‘/subscriptions/XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/resourceGroups/EastUSVault/providers/Microsoft.RecoveryServices/vaults/rsv/backupFabrics/Azure/protectionContainers/iaasvmcontainer;iaasvmcontainerv2;AzureLab;backuptestvm01/protectedItems/vm;iaasvmcontainerv2;AzureLab;backuptestvm01’.’

Notice the key parts of the error message: we have a Client ID and Object ID, and a permissions issue at a scope inside the vault’s Resource Group. Why is this?

Remember in the Policy Assignment, I assigned the policy to a target Resource Group (“AzureLab” in this instance) for testing? Do you also recall how I selected a specific Recovery Services Vault (RSV)? Notice that the RSV is actually contained in a different Resource Group (“EastUSVault” in this case)?
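Because the error blade is hard to read, it can help to pull the interesting fields out of the message programmatically. The following is only a sketch: the function names are hypothetical, and the pattern is based solely on the wording of the error shown above, so a differently worded message will simply produce no matches. It accepts both straight and curly quotes, since copying the text around tends to change them.

```python
import re
from typing import Dict, List, Optional

# Quote characters that may surround the values (straight, curly, prime).
_QUOTES = "'\u2018\u2019\u2032"
_Q = f"[{_QUOTES}]"        # any one quote character
_BODY = f"[^{_QUOTES}]+"   # everything up to the next quote character

_DENIAL = re.compile(
    f"object id {_Q}(?P<object_id>{_BODY}){_Q}"
    f" does not have permission to perform action {_Q}(?P<action>{_BODY}){_Q}"
    f" at scope {_Q}(?P<scope>{_BODY}){_Q}"
)


def parse_denials(message: str) -> List[Dict[str, str]]:
    """Return object_id, action, and scope for each denial in the message."""
    return [m.groupdict() for m in _DENIAL.finditer(message)]


def resource_group_of(scope: str) -> Optional[str]:
    """Extract the Resource Group name from a scope, if it has one."""
    m = re.search(r"/resourceGroups/([^/]+)", scope, re.IGNORECASE)
    return m.group(1) if m else None
```

Run against the message above, both denials point at the vault’s Resource Group rather than the one the policy was assigned to.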

Here’s something that I discovered…

If you associate the Policy to a specific Resource Group and configure it for an RSV that is in the same Resource Group as the VM, it works successfully.

However, if you configure the policy for a Recovery Services Vault (RSV) that is NOT in the same Resource Group as the VM, it fails (with the error mentioned above).

This latter configuration is what I would consider the most common one. Most organizations would not deploy the Recovery Services Vault (RSV) in the same Resource Group as the Virtual Machines, especially when there can be many VMs grouped by project, environment, etc. Otherwise, you would end up with RSV sprawl.
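The difference between the two cases comes down to whether the vault sits inside the assignment’s scope. As a rough pre-flight check, that can be sketched in a few lines of Python. The function name is hypothetical and the IDs in the usage note are placeholders; the comparison ignores case because the same Resource Group appears as both ‘eastusvault’ and ‘EastUSVault’ in the error above.

```python
def is_within_scope(resource_id: str, scope: str) -> bool:
    """True if resource_id is the scope itself or lies beneath it.

    Both values are full Azure resource IDs; the comparison is
    case-insensitive and ignores a trailing slash.
    """
    resource = resource_id.rstrip("/").lower()
    parent = scope.rstrip("/").lower()
    # Append "/" so that "AzureLab" does not match "AzureLab2".
    return resource == parent or resource.startswith(parent + "/")
```

For example, a vault under ‘/subscriptions/…/resourceGroups/EastUSVault’ is not within an assignment scoped to ‘/subscriptions/…/resourceGroups/AzureLab’, but it is within one scoped to the Subscription itself.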

So, how do we overcome this?

The Solution

Here’s what I had to do…

Put simply, the Managed Identity for the remediation task does not have permissions on the Recovery Services Vault (RSV) to create a new Protected Item entry. And the Policy doesn’t take into consideration that the RSV may be in a different Resource Group.

Using the Object ID from the error message above (it belongs to the Policy Assignment’s Managed Identity), I was able to locate the Managed Identity object in Azure AD. Notice that the ‘Name’ actually corresponds to the Policy Assignment ID.

Then, I granted this Managed Identity the ‘Backup Contributor’ role on the Resource Group that contains the RSV. Of course, you could assign the role directly on the Recovery Services Vault (RSV) instead of the Resource Group it’s contained in.
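I did this in the portal, but the same role assignment could presumably be scripted with the Azure CLI’s ‘az role assignment create’ command. The sketch below only builds the argument list and runs nothing; the helper name is hypothetical, and you would pass in the Object ID from your own error message and the ID of your own vault Resource Group.

```python
from typing import List


def role_assignment_command(
    principal_object_id: str,
    scope: str,
    role: str = "Backup Contributor",
) -> List[str]:
    """Build an 'az role assignment create' call for a Managed Identity.

    principal_object_id: the Object ID reported in the deployment error.
    scope: the Resource Group (or vault) resource ID that holds the RSV.
    """
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_object_id,
        # A Managed Identity is a service principal as far as RBAC is concerned.
        "--assignee-principal-type", "ServicePrincipal",
        "--role", role,
        "--scope", scope,
    ]
```

The resulting list can be handed to ‘subprocess.run(cmd, check=True)’ from a session that is already logged in with the Azure CLI and allowed to create role assignments at that scope.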

To test the permission changes, I then manually triggered the Remediation Task, which was then able to complete successfully.
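If you would rather re-run the remediation from a script than from the portal, the Azure CLI has an ‘az policy remediation create’ command for this. Again, this is just a sketch that assembles the arguments: the helper name and the task name are hypothetical, and it assumes the assignment is scoped to a Resource Group, as it was in my test.

```python
from typing import List


def remediation_command(
    task_name: str,
    assignment_id: str,
    resource_group: str,
) -> List[str]:
    """Build an 'az policy remediation create' call at Resource Group scope.

    assignment_id: the Assignment ID (or name) of the policy assignment.
    resource_group: the Resource Group the policy was assigned to.
    """
    return [
        "az", "policy", "remediation", "create",
        "--name", task_name,
        "--policy-assignment", assignment_id,
        "--resource-group", resource_group,
    ]
```

In my case the Resource Group would be “AzureLab” (where the policy is assigned), not “EastUSVault” (where the vault lives).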

To Sum Up

The Managed Identity receives its permissions at the scope selected for the Policy assignment. This is the expected behavior (so I’m told).

This means that…

  • If you set the Policy scope to a Resource Group, then both the Virtual Machine and Recovery Services Vault (RSV) should be in the same Resource Group (or at least the Managed Identity expects that). However, we’ve seen a workaround to this (via the generated Managed Identity being granted ‘Backup Contributor’ permissions on the RSV).
  • If you set the Policy scope to a Subscription, then both the Virtual Machine and Recovery Services Vault (RSV) can be in any Resource Group within that Subscription.

Here’s a good reference article: How remediation security works

Hopefully, this helps you in your testing of the ‘Configure backup on VMs of a location to an existing central Vault in the same location’ Azure Policy.

Stay tuned for another blog post soon, on customizing this policy further!
