Friday, July 30, 2021

Automating Azure Site Recovery Recovery Plan Test Failover with PowerShell Script (on-premise VMs to Azure)

A colleague recently asked whether I had any PowerShell scripts that automate the test failover and cleanup of VMs replicated with Azure Site Recovery (ASR). My first thought was that there must be plenty of such scripts on the internet, but I quickly found that the Google results were either the official Microsoft documentation, which walks through configuring ASR, replicating, and failing over only a single VM (https://docs.microsoft.com/en-us/azure/site-recovery/azure-to-azure-powershell), or blog posts that provided bits and pieces of information rather than a complete script.

Having been involved in Azure Site Recovery design, implementation and testing, I have created a PowerShell script to initiate the failover of a recovery plan and then perform the cleanup when the DR environment has been tested. This post serves to share the script that I use and I would encourage anyone who decides to use it to improve and customize the script as needed.

Environment

The script is intended for an environment where the source is on-premise and the target is Azure’s East US region. The source environment consists of virtual machines hosted on VMware vSphere.

Requirements

  1. Account with appropriate permissions that will be used to connect to the tenant with the Connect-AzAccount PowerShell cmdlet
  2. Recovery Plan already configured (we’ll be initiating the Test failover on the Recovery Plan and not individual VMs).
  3. The Subscription ID containing the servers being replicated
  4. The name of the Recovery Services Vault containing the replicated VMs
  5. The Recovery Plan name that will be failed over
  6. The VNet name that will be used for the failover VMs

Script Process

  1. Connect to Azure with Connect-AzAccount
  2. Set the context to the subscription ID
  3. Initiate the Test Failover task for the recovery plan
  4. Wait until the Test Failover has completed
  5. Notify the user that the Test Failover has completed
  6. Pause and prompt the user to clean up the failover test VMs
  7. Proceed to clean up the Test Failover
  8. End script

I plan to add further improvements in the future, such as accepting a subscription ID upon execution, providing recovery plan selection for failover testing, or listing the details of failed-over VMs (I can’t seem to find a cmdlet that displays the list of VMs and their status in a specified Recovery Plan).
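As a rough illustration of the first two improvements, the sketch below shows one way a recovery plan could be picked from a numbered menu. The function name Select-FromList and its -Reader parameter are hypothetical and not part of the script in this post; the reader is injectable only so the function can be tried without a console.

```powershell
# Hypothetical helper (not part of the original script): pick one item from a numbered list.
function Select-FromList {
    param(
        [Parameter(Mandatory = $true)][string[]]$Items,
        # Injectable reader so the function can be exercised without a console; defaults to Read-Host
        [scriptblock]$Reader = { param($Prompt) Read-Host -Prompt $Prompt }
    )
    for ($i = 0; $i -lt $Items.Count; $i++) {
        Write-Host ("[{0}] {1}" -f ($i + 1), $Items[$i])
    }
    do {
        $answer = & $Reader "Select a recovery plan (1-$($Items.Count))"
        $index = 0
        # Accept only a whole number within the range of the list
        $valid = [int]::TryParse($answer, [ref]$index) -and $index -ge 1 -and $index -le $Items.Count
    } until ($valid)
    return $Items[$index - 1]
}

# Possible usage, assuming the script starts with
#   param([Parameter(Mandatory = $true)][string]$SubscriptionId)
# and the vault context has already been set:
#   Set-AzContext -SubscriptionId $SubscriptionId
#   $plans = Get-AzRecoveryServicesAsrRecoveryPlan
#   $ASRRecoveryPlanName = Select-FromList -Items $plans.FriendlyName
```

Invalid input (non-numeric or out of range) simply causes the prompt to repeat, which keeps the rest of the script unchanged.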

Script Variables

$RSVaultName = <name of Recovery Services Vault> - e.g. "rsv-us-eus-contoso-asr"

$ASRRecoveryPlanName = <name of Recovery Plan> - e.g. "Recover-Domain-Controllers"

$TestFailoverVNetName = <name of the VNet in the failover site that the VMs are to be connected to> - e.g. "vnet-us-eus-dr"

The Script

The following is the script:

Connect-AzAccount

Set-AzContext -SubscriptionId "adae0952-xxxx-xxxx-xxxx-2b8ef42c9bbb"

$RSVaultName = "rsv-us-eus-contoso-asr"
$ASRRecoveryPlanName = "Recover-Domain-Controllers"
$TestFailoverVNetName = "vnet-us-eus-dr"

# Select the Recovery Services vault and set the ASR vault context
$vault = Get-AzRecoveryServicesVault -Name $RSVaultName
Set-AzRecoveryServicesAsrVaultContext -Vault $vault

$RecoveryPlan = Get-AzRecoveryServicesAsrRecoveryPlan -FriendlyName $ASRRecoveryPlanName

# Resolve the resource ID of the VNet the test failover VMs will connect to
$TFOVnet = Get-AzVirtualNetwork -Name $TestFailoverVNetName
$TFONetwork = $TFOVnet.Id

# Start test failover of recovery plan
$Job_TFO = Start-AzRecoveryServicesAsrTestFailoverJob -RecoveryPlan $RecoveryPlan -Direction PrimaryToRecovery -AzureVMNetworkId $TFONetwork

do {
    try {
        # -ErrorAction Stop makes a failed status query terminating so the catch block runs
        $Job_TFOState = Get-AzRecoveryServicesAsrJob -Job $Job_TFO -ErrorAction Stop | Select-Object State
    }
    catch {
        Write-Host -ForegroundColor Red "ERROR - Unable to get status of Failover job"
        Write-Host -ForegroundColor Red "ERROR - $_"
        exit
    }

    Clear-Host
    Write-Host "======== Monitoring Failover ========"
    Write-Host "Status will refresh every 5 seconds."
    Write-Host "Failover status for $($Job_TFO.TargetObjectName) is $($Job_TFOState.State)"
    Start-Sleep 5
} while (($Job_TFOState.State -eq "InProgress") -or ($Job_TFOState.State -eq "NotStarted"))

if ($Job_TFOState.State -eq "Failed") {
    Write-Host "The test failover job failed. Script terminating."
    exit
}
else {
    Read-Host -Prompt "Test failover has completed. Please check ASR Portal, test VMs and press enter to perform cleanup..."

    # Start test failover cleanup of recovery plan
    $Job_TFOCleanup = Start-AzRecoveryServicesAsrTestFailoverCleanupJob -RecoveryPlan $RecoveryPlan -Comment "Testing Completed"

    do {
        try {
            $Job_TFOCleanupState = Get-AzRecoveryServicesAsrJob -Job $Job_TFOCleanup -ErrorAction Stop | Select-Object State
        }
        catch {
            Write-Host -ForegroundColor Red "ERROR - Unable to get status of cleanup job"
            Write-Host -ForegroundColor Red "ERROR - $_"
            exit
        }

        Clear-Host
        Write-Host "======== Monitoring Cleanup ========"
        Write-Host "Status will refresh every 5 seconds."
        Write-Host "Cleanup status for $($Job_TFO.TargetObjectName) is $($Job_TFOCleanupState.State)"
        Start-Sleep 5
    } while (($Job_TFOCleanupState.State -eq "InProgress") -or ($Job_TFOCleanupState.State -eq "NotStarted"))

    Write-Host "Test failover cleanup completed."
}
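
The failover and cleanup monitoring loops in the script above follow the same pattern, so they could be factored into a single helper. The following is only a sketch of that idea: Wait-JobState and its parameters are hypothetical names, the state is fetched through a script block supplied by the caller, and a timeout is added as an assumption so the loop cannot run forever.

```powershell
# Hypothetical helper (not part of the original script): poll until the job
# leaves the NotStarted/InProgress states, then return the final state.
function Wait-JobState {
    param(
        # Script block that returns the current job state as a string
        [Parameter(Mandatory = $true)][scriptblock]$GetState,
        [int]$IntervalSeconds = 5,
        # Assumed upper bound; adjust to the size of the recovery plan
        [int]$TimeoutSeconds = 3600
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $state = & $GetState
        Write-Host "Current job state: $state"
        if ($state -ne 'InProgress' -and $state -ne 'NotStarted') {
            return $state
        }
        Start-Sleep -Seconds $IntervalSeconds
    } while ((Get-Date) -lt $deadline)
    throw "Timed out after $TimeoutSeconds seconds; last state was '$state'."
}

# Possible usage with the variables from the script above:
#   $final = Wait-JobState -GetState { (Get-AzRecoveryServicesAsrJob -Job $Job_TFO -ErrorAction Stop).State }
#   if ($final -eq 'Failed') { Write-Host "The test failover job failed." }
```

Because the state lookup is passed in, the same function can monitor both the failover job and the cleanup job.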

image

The following are screenshots of the PowerShell script output:

image

I hope this will help anyone out there who may be looking for a PowerShell script to automate the ASR test failover process.

One addition I wanted to make to this script was to list the status of the VMs in the recovery plan after the test failover has completed, but I could not find a way to list only the VMs that belong to the recovery plan. The cmdlets below list all of the protected VMs, but their properties do not appear to contain any reference to the recovery plans they belong to. Please feel free to comment if you happen to know the solution.

# svr-asr-01 represents the Configuration Server
$PrimaryFabric = Get-AzRecoveryServicesAsrFabric -FriendlyName svr-asr-01
$PrimaryProtContainer = Get-AzRecoveryServicesAsrProtectionContainer -Fabric $PrimaryFabric
$ReplicationProtectedItem = Get-AzRecoveryServicesAsrReplicationProtectedItem -ProtectionContainer $PrimaryProtContainer
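
One possible direction, which has not been verified against a live vault, is to work from the recovery plan object rather than from the protected items. The sketch below assumes that the object returned by Get-AzRecoveryServicesAsrRecoveryPlan exposes a Groups collection, that each group exposes a ReplicationProtectedItems collection, and that the items carry an ID property; those property names are assumptions to check with Get-Member. Get-RecoveryPlanItem is a hypothetical helper name.

```powershell
# Hypothetical helper (not part of the original script): return the protected
# items whose ID appears in any group of the given recovery plan.
function Get-RecoveryPlanItem {
    param(
        [Parameter(Mandatory = $true)]$RecoveryPlan,
        [Parameter(Mandatory = $true)][object[]]$ProtectedItems
    )
    # ASSUMPTION: the plan has Groups, each with ReplicationProtectedItems that carry an ID
    $planItemIds = @(
        $RecoveryPlan.Groups |
            ForEach-Object { $_.ReplicationProtectedItems } |
            Where-Object { $_ } |          # skip groups without protected items
            ForEach-Object { $_.ID }
    )
    # -contains compares strings case-insensitively, which suits resource IDs
    $ProtectedItems | Where-Object { $planItemIds -contains $_.ID }
}

# Possible usage with the variables from the snippets above:
#   Get-RecoveryPlanItem -RecoveryPlan $RecoveryPlan -ProtectedItems $ReplicationProtectedItem |
#       Select-Object FriendlyName, ProtectionState, TestFailoverState
```

If the assumed properties turn out to be named differently, only the two property references inside the function need to change.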

----------Update July 31, 2021---------

After reviewing some of my old notes, I found another version of the PowerShell script that performs a test failover for two recovery plans and includes steps to shut down a VM and remove the VNet peering between the production and DR regions before the test failover, then recreate the peering and power the VM back on afterwards. The following is a copy of the script:

Connect-AzAccount

Set-AzContext -SubscriptionId "53ea69af-xxx-xxxx-a020-xxxxea02f8b"

# Shut down DC2
Write-Host "Shutting down DC2 VM in DR"
$DRDCName = "DC2"
$DRDCRG = "Canada-East-Prod"
Stop-AzVM -ResourceGroupName $DRDCRG -Name $DRDCName -Force

# Declare variables for DR production VNet
$DRVNetName = "vnet-prod-canadaeast"
$DRVnetRG = "Canada-East-Prod"
$DRVNetPeerName = "DR-to-Prod"
$DRVNetObj = Get-AzVirtualNetwork -Name $DRVNetName
$DRVNetID = $DRVNetObj.Id

# Declare variables for Production VNet
$ProdVNetName = "Contoso-Prod-vnet"
$ProdVnetRG = "Contoso-Prod"
$ProdVNetPeerName = "Prod-to-DR"
$ProdVNetObj = Get-AzVirtualNetwork -Name $ProdVNetName
$ProdVNetID = $ProdVNetObj.Id

# Remove the DR VNet's peering to production
Write-Host "Removing VNet peering between Production and DR environment"
Remove-AzVirtualNetworkPeering -Name $DRVNetPeerName -VirtualNetworkName $DRVNetName -ResourceGroupName $DRVnetRG -Force
Remove-AzVirtualNetworkPeering -Name $ProdVNetPeerName -VirtualNetworkName $ProdVNetName -ResourceGroupName $ProdVnetRG -Force

# Failover Test for Domain Controller BREAZDC2
$RSVaultName = "rsv-asr-canada-east"
$ASRRecoveryPlanName = "Domain-Controller"
$TestFailoverVNetName = "vnet-prod-canadaeast"

$vault = Get-AzRecoveryServicesVault -Name $RSVaultName
Set-AzRecoveryServicesAsrVaultContext -Vault $vault

$RecoveryPlan = Get-AzRecoveryServicesAsrRecoveryPlan -FriendlyName $ASRRecoveryPlanName

$TFOVnet = Get-AzVirtualNetwork -Name $TestFailoverVNetName
$TFONetwork = $TFOVnet.Id

$Job_TFO = Start-AzRecoveryServicesAsrTestFailoverJob -RecoveryPlan $RecoveryPlan -Direction PrimaryToRecovery -AzureVMNetworkId $TFONetwork

do {
    try {
        # -ErrorAction Stop makes a failed status query terminating so the catch block runs
        $Job_TFOState = Get-AzRecoveryServicesAsrJob -Job $Job_TFO -ErrorAction Stop | Select-Object State
    }
    catch {
        Write-Host -ForegroundColor Red "ERROR - Unable to get status of Failover job"
        Write-Host -ForegroundColor Red "ERROR - $_"
        exit
    }

    Clear-Host
    Write-Host "======== Monitoring Failover ========"
    Write-Host "Status will refresh every 5 seconds."
    Write-Host "Failover status for $($Job_TFO.TargetObjectName) is $($Job_TFOState.State)"
    Start-Sleep 5
} while (($Job_TFOState.State -eq "InProgress") -or ($Job_TFOState.State -eq "NotStarted"))

if ($Job_TFOState.State -eq "Failed") {
    Write-Host "The test failover job failed. Script terminating."
    exit
}
else {
    # Failover Test for Remaining Servers
    $ASRRecoveryPlanName = "DR-Servers"
    $RecoveryPlan = Get-AzRecoveryServicesAsrRecoveryPlan -FriendlyName $ASRRecoveryPlanName

    $Job_TFO = Start-AzRecoveryServicesAsrTestFailoverJob -RecoveryPlan $RecoveryPlan -Direction PrimaryToRecovery -AzureVMNetworkId $TFONetwork

    do {
        try {
            $Job_TFOState = Get-AzRecoveryServicesAsrJob -Job $Job_TFO -ErrorAction Stop | Select-Object State
        }
        catch {
            Write-Host -ForegroundColor Red "ERROR - Unable to get status of Failover job"
            Write-Host -ForegroundColor Red "ERROR - $_"
            exit
        }

        Clear-Host
        Write-Host "======== Monitoring Failover ========"
        Write-Host "Status will refresh every 5 seconds."
        Write-Host "Failover status for $($Job_TFO.TargetObjectName) is $($Job_TFOState.State)"
        Start-Sleep 5
    } while (($Job_TFOState.State -eq "InProgress") -or ($Job_TFOState.State -eq "NotStarted"))

    if ($Job_TFOState.State -eq "Failed") {
        Write-Host "The test failover job failed. Script terminating."
        exit
    }
    else {
        Read-Host -Prompt "Test failover has completed. Please check ASR Portal, test VMs and press enter to perform cleanup..."

        # Clean up the test failover of the DR-Servers recovery plan
        $Job_TFOCleanup = Start-AzRecoveryServicesAsrTestFailoverCleanupJob -RecoveryPlan $RecoveryPlan -Comment "Testing Completed"

        do {
            try {
                $Job_TFOCleanupState = Get-AzRecoveryServicesAsrJob -Job $Job_TFOCleanup -ErrorAction Stop | Select-Object State
            }
            catch {
                Write-Host -ForegroundColor Red "ERROR - Unable to get status of cleanup job"
                Write-Host -ForegroundColor Red "ERROR - $_"
                exit
            }

            Clear-Host
            Write-Host "======== Monitoring Cleanup ========"
            Write-Host "Status will refresh every 5 seconds."
            Write-Host "Cleanup status for $($Job_TFO.TargetObjectName) is $($Job_TFOCleanupState.State)"
            Start-Sleep 5
        } while (($Job_TFOCleanupState.State -eq "InProgress") -or ($Job_TFOCleanupState.State -eq "NotStarted"))

        # Clean up the test failover of the Domain-Controller recovery plan
        $ASRRecoveryPlanName = "Domain-Controller"
        $RecoveryPlan = Get-AzRecoveryServicesAsrRecoveryPlan -FriendlyName $ASRRecoveryPlanName

        $Job_TFOCleanup = Start-AzRecoveryServicesAsrTestFailoverCleanupJob -RecoveryPlan $RecoveryPlan -Comment "Testing Completed"

        do {
            try {
                $Job_TFOCleanupState = Get-AzRecoveryServicesAsrJob -Job $Job_TFOCleanup -ErrorAction Stop | Select-Object State
            }
            catch {
                Write-Host -ForegroundColor Red "ERROR - Unable to get status of cleanup job"
                Write-Host -ForegroundColor Red "ERROR - $_"
                exit
            }

            Clear-Host
            Write-Host "======== Monitoring Cleanup ========"
            Write-Host "Status will refresh every 5 seconds."
            Write-Host "Cleanup status for $($ASRRecoveryPlanName) is $($Job_TFOCleanupState.State)"
            Start-Sleep 5
        } while (($Job_TFOCleanupState.State -eq "InProgress") -or ($Job_TFOCleanupState.State -eq "NotStarted"))

        Write-Host "Test failover cleanup completed."
    }
}

# Recreate the DR VNet's peering to production
Write-Host "Recreating VNet peering between Production and DR environment after failover testing"
Add-AzVirtualNetworkPeering -Name $DRVNetPeerName -VirtualNetwork $DRVNetObj -RemoteVirtualNetworkId $ProdVNetID -AllowForwardedTraffic
Add-AzVirtualNetworkPeering -Name $ProdVNetPeerName -VirtualNetwork $ProdVNetObj -RemoteVirtualNetworkId $DRVNetID -AllowForwardedTraffic

# Power on DC2
Write-Host "Powering on DC2 VM in DR after testing"
Start-AzVM -ResourceGroupName $DRDCRG -Name $DRDCName
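
One thing to be aware of with the script above is that it exits as soon as a test failover job fails, which leaves the VNet peering removed and DC2 powered off. A try/finally wrapper is one way to make sure the restore steps always run. The sketch below is only an illustration: Invoke-WithRestore is a hypothetical name, and it assumes the test steps signal failure with throw rather than exit.

```powershell
# Hypothetical wrapper (not part of the original script): run the test steps,
# then always run the restore steps, even if the test steps throw.
function Invoke-WithRestore {
    param(
        [Parameter(Mandatory = $true)][scriptblock]$Test,
        [Parameter(Mandatory = $true)][scriptblock]$Restore
    )
    try {
        & $Test
    }
    finally {
        # Runs whether $Test completed or threw; the error still reaches the caller
        & $Restore
    }
}

# Possible usage with the variables from the script above:
#   Invoke-WithRestore -Test {
#       # ... test failover and cleanup steps, using throw instead of exit on failure ...
#   } -Restore {
#       Add-AzVirtualNetworkPeering -Name $DRVNetPeerName -VirtualNetwork $DRVNetObj -RemoteVirtualNetworkId $ProdVNetID -AllowForwardedTraffic
#       Add-AzVirtualNetworkPeering -Name $ProdVNetPeerName -VirtualNetwork $ProdVNetObj -RemoteVirtualNetworkId $DRVNetID -AllowForwardedTraffic
#       Start-AzVM -ResourceGroupName $DRDCRG -Name $DRDCName
#   }
```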

Monday, July 26, 2021

Azure Site Recovery replication for Windows 2008 R2 server fails with: "Installation of mobility agent has failed as SHA-2 code signing is not supported on the current Microsoft Windows Server 2008 R2 Standard OS version"

Although Windows Server 2008 R2 has reached end of support, I still periodically come across it when working with clients, and one of the common scenarios I’ve had to deal with is replicating these servers from an on-premise network to Microsoft Azure with Azure Site Recovery. Below is an issue that I’ve seen quite a few times, so I’d like to write this quick blog post to describe the problem and the steps to remediate it.

Problem

You’re trying to replicate an on-premise Windows 2008 R2 server that has Service Pack 1 installed to Azure with Azure Site Recovery:

image

However, the installation of the mobility service fails:

image

The specific Error Details for the server are as follows:

----------------------------------------------------------------------------------------------------------------------------

Error Details

Installing Mobility Service and preparing target

· Error ID: 78007

· Error Message: The requested operation did not complete.

· Provider error:

Provider error code: 95560

Provider error message: Installation of mobility agent has failed as SHA-2 code signing is not supported on the current Microsoft Windows Server 2008 R2 Standard OS version.

Provider error possible causes: For successful installation, mobility service requires SHA-2 support as SHA-1 is deprecated from September 2019.

Provider error recommended action: Update your Microsoft Windows Server 2008 R2 Standard operating system with the following KB articles and then retry the operation.

o Servicing stack update (SSU): https://support.microsoft.com/en-us/help/4490628

o SHA-2 update: https://support.microsoft.com/en-us/help/4474419/sha-2-code-signing-support-update

o Learn more: https://aka.ms/asr-os-support

· Possible causes: Check the provider error for more details.

· Recommendation: Resolve the issue as recommended in the provider error details.

· Related links:

o https://support.microsoft.com/en-us/help/4490628

o https://support.microsoft.com/en-us/help/4474419/sha-2-code-signing-support-update

o https://aka.ms/asr-os-support

· First Seen At: 7/22/2021, 9:28:00 PM

----------------------------------------------------------------------------------------------------------------------------

image

The Error Details suggest downloading and installing KB4490628, but when you attempt to do so, the installation wizard indicates that the update is already installed on the server:

https://support.microsoft.com/en-us/help/4490628

AMD64-all-windows6.1-kb4490628-x64_d3de52d6987f7c8bdc2c015dca69eac96047c76e.msu

image

Solution

I’ve come across the following 2 scenarios for this:

  1. The update KB4490628 indicated above has been installed
  2. The update KB4490628 indicated above has not been installed

Regardless of which of the above scenarios applies to the problematic server, the first step is to download and install the following KB4474419 update:

2019-09 Security Update for Windows Server 2008 R2 for x64-based Systems (KB4474419)

AMD64-all-windows6.1-kb4474419-v3-x64_b5614c6cea5cb4e198717789633dca16308ef79c.msu

image

image

Once the update has been installed and the server has been restarted, try installing the suggested KB4490628. If it is already installed, the installer will not continue; if it is not, the installation will proceed, complete, and not require a restart.
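
Before retrying replication, it can be handy to confirm which of the two updates are present. The following is a minimal sketch that sticks to syntax available in the older PowerShell version found on Windows Server 2008 R2. Get-MissingKb is a hypothetical helper name, and note that Get-HotFix may not report every installed update, so treat the result as a hint rather than proof.

```powershell
# Hypothetical helper (not part of the original post): return the required KB
# IDs that do not appear in the list of installed hotfix IDs.
function Get-MissingKb {
    param(
        [string[]]$Required = @('KB4474419', 'KB4490628'),
        [string[]]$Installed = @()
    )
    $Required | Where-Object { $Installed -notcontains $_ }
}

# Possible usage on the Windows Server 2008 R2 machine:
#   $installed = Get-HotFix | ForEach-Object { $_.HotFixID }
#   Get-MissingKb -Installed $installed
```

An empty result suggests both updates are already in place and the Mobility Service installation can be retried.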

With the above completed, the Microsoft Azure Site Recovery Mobility Service/Master Target Server should now install and the Enable replication job should complete successfully:

image

With the required updates installed, the deployment of the Mobility Service agent should succeed and the replication job should complete:

image

Hope this helps anyone who may be encountering this issue.