Active Directory Snapshot
December 16, 2013
Active Directory (AD) is the base of most enterprise-level infrastructures and has been for some time. We have become accustomed to seeing this structure and depending on it. But AD has been a thorn in our side ever since virtualization became popular, because domain controllers could not safely be restored from snapshots. This is no longer the case if your shop is running Active Directory on Windows Server 2012.
With the release of Windows Server 2012, Microsoft added support for a new identifier called the VM-Generation ID that makes it safe to restore AD servers from snapshots.
Why was Restoring from Snapshots of AD Servers bad?
Active Directory keeps track of what data has been replicated to fellow Domain Controllers by tagging changes with an Update Sequence Number (USN). When a restore from snapshot occurs, this USN is rolled back to the value it had when the snapshot was taken.
Look at what happens when an AD server is restored from a snapshot prior to Server 2012.
In the above example the server on the left is replicating changes to the server on the right as part of normal AD replication operations. During step 1, updates are replicated to the second DC and the USN is incremented. In step 2, the same thing happens. Between steps 2 and 3 the server is restored from snapshot, meaning that we’ve rolled back the USN to 100 again. The problem is that the DC on the right is still waiting only for USNs greater than 200, so it will ignore the changes the DC on the left makes after the restore, leaving us with a whole mess of problems.
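To make the rollback a little more concrete, here is a toy simulation in Python. It is only a sketch of the idea and not how AD replication is actually implemented: the SourceDC and PartnerDC classes, the "high watermark" field and the user names are all made up for illustration, and the USNs are kept small so the numbers are easy to follow.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SourceDC:
    """Hypothetical DC that stamps every change with the next USN."""
    usn: int = 100
    changes: dict = field(default_factory=dict)  # USN -> change description

    def write(self, description):
        self.usn += 1
        self.changes[self.usn] = description
        return self.usn

    def changes_after(self, high_watermark):
        # Only changes with a USN above the partner's watermark are offered.
        return {u: d for u, d in sorted(self.changes.items()) if u > high_watermark}


@dataclass
class PartnerDC:
    """Hypothetical partner that remembers the highest USN it has seen."""
    high_watermark: int = 100
    received: list = field(default_factory=list)

    def pull_from(self, source):
        for usn, description in source.changes_after(self.high_watermark).items():
            self.received.append(description)
            self.high_watermark = usn


source = SourceDC()
partner = PartnerDC()
snapshot = copy.deepcopy(source)    # snapshot taken while the USN is 100

source.write("create user alice")   # USN 101
source.write("create user bob")     # USN 102
partner.pull_from(source)           # partner now expects USNs above 102

source = copy.deepcopy(snapshot)    # restore from snapshot: USN is 100 again
source.write("create user carol")   # reuses USN 101
partner.pull_from(source)           # nothing is above 102, so carol is skipped

print(partner.received)
```

The partner never hears about the last change, because from its point of view USN 101 was already replicated.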
How did Server 2012 Fix these USN Rollbacks?
In Server 2012 Microsoft added support for a new identifier called the VM-Generation ID. This is an identifier used specifically to stop the USN rollback from occurring. The hypervisor exposes the VM-Generation ID to the virtual machine, and the domain controller keeps a copy of it in its own AD database. When the virtual machine is reverted to a snapshot, the hypervisor changes this VM-Generation ID, and that is where the magic happens.
Let’s look at the new example. Everything is the same in the first two steps except that there is now a VM-Generation ID. After the restore from snapshot, the domain controller compares the current VM-Generation ID with the one it saved earlier, and if they don’t match, it basically performs a non-authoritative restore. During this type of restore, the Active Directory partition only receives copies of AD changes until it’s back in sync with the other DCs. Once that’s finished it can participate in sending updates again.
Obviously, my explanation of this process is very rudimentary, so if you’d like a more detailed description, please check out this TechNet article.
Only certain hypervisors work with the VM-Generation ID. Hyper-V 3.0 and vSphere 5.0 U2 or higher should be fine. If you want to know for sure, check the event log on your Domain Controller for Event ID 2168.
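The comparison itself is simple enough to sketch in a few lines of Python. Again, this is a simplified model and not the real AD code: the VirtualDC class, its method names and the "gen-A"/"gen-B" values are hypothetical, and the real safeguards do more than flip a single flag.

```python
from dataclasses import dataclass


@dataclass
class VirtualDC:
    """Hypothetical DC that remembers the last generation ID it saw."""
    saved_generation_id: str
    inbound_only: bool = False  # True while the DC is catching up

    def check_generation_id(self, current_generation_id):
        # A mismatch means the VM was reverted (or copied) behind AD's back.
        if current_generation_id != self.saved_generation_id:
            self.inbound_only = True
            self.saved_generation_id = current_generation_id
        return self.inbound_only

    def finish_sync(self):
        # Called once the DC has caught up with its replication partners.
        self.inbound_only = False

    def can_send_updates(self):
        return not self.inbound_only


dc = VirtualDC(saved_generation_id="gen-A")

dc.check_generation_id("gen-A")     # normal boot: IDs match, nothing to do
normal_boot_ok = dc.can_send_updates()

dc.check_generation_id("gen-B")     # after a revert the hypervisor reports a new ID
after_revert_ok = dc.can_send_updates()

dc.finish_sync()                    # back in sync with the other DCs
after_sync_ok = dc.can_send_updates()

print(normal_boot_ok, after_revert_ok, after_sync_ok)
```

The point of the design is that the restored disk can only ever contain the old ID, so a revert cannot hide from the check.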
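If you would rather script that check than click through Event Viewer, something like the following Python sketch could work. It assumes you run it on the Domain Controller itself with enough rights to read the log, that the built-in wevtutil tool is available, and that the event lives in the "Directory Service" log. The function names are my own, and treating empty output as "no event found" is an assumption.

```python
import subprocess

# Assumption: Event ID 2168 is written to the "Directory Service" log.
LOG_NAME = "Directory Service"


def build_query_command(event_id, log_name=LOG_NAME):
    """Build a wevtutil query for the newest event with the given ID."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",   # XPath filter on the event ID
        "/c:1",          # return at most one event
        "/rd:true",      # read newest events first
        "/f:text",       # plain-text output
    ]


def event_is_logged(event_id):
    """Return True if wevtutil prints at least one matching event."""
    result = subprocess.run(
        build_query_command(event_id),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and bool(result.stdout.strip())
```

On the Domain Controller you would call event_is_logged(2168) and expect True on a supported hypervisor.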
Just to prove that this works, I ran it in my home lab. I wanted to see how easy this was to do and if I noticed any hiccups with this process. After all, Active Directory is a pretty important thing to keep from corrupting.
I built an AD Server and made sure it was replicated correctly. Then I created a snapshot of the server.
Once the snapshot was done, I created a new user in Active Directory on this server. I waited a bit and then performed a restore on the server. I noticed several events in the event viewer after the restore.
We see a warning message about the GenerationID change being detected. That’s a good thing in our case!
AD has realized that a non-authoritative restore must occur. Also good news for us. This means it’s fixing Active Directory replication for us.
I’ve seen the horror stories about USN rollback after snapshotting and restoring an Active Directory server, and I would have hesitated to do this in a live environment. Even though it seems like a scary thing to do, it does work and can be trusted in your environment. Just make sure it’s supported on your servers first!