SAN Snapshots vs VMware Snapshots
February 27, 2012
I’ve found that people have a hard time understanding that a SAN snapshot and a VMware snapshot are fundamentally different. I think that’s because, unless you’re a storage administrator, you’re probably not dealing with snapshots much to begin with. VMware has made it more commonplace for system administrators to deal with snapshot technology.
SAN Snapshots
Let’s first look at how traditional SANs take snapshots.
To start, we have 6 blocks being used. The file system has marked which blocks are in use.
Now we create a SAN Snapshot and modify part of the data. Notice how the data changes.
As you can see, block 4 was not overwritten, but the new data was written to block 11 and the file system made note of the change. Notice the data shows 1,2,3,11,5,6 where 11 has replaced 4.
The snapshot however still has pointers to the original data which can be mounted, backed up, modified etc. depending upon your storage system.
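Here is a rough Python sketch of the pointer idea described above. The Volume class, its method names, and the letter data are made up for illustration; they don't correspond to any real array's API, and real systems track far more metadata than this.

```python
# Illustrative model only: physical blocks live in a dict, and a view of the
# volume is just an ordered list of pointers (physical block numbers).

class Volume:
    def __init__(self, data):
        # Physical blocks 1..N hold the initial data.
        self.blocks = {i + 1: d for i, d in enumerate(data)}
        self.pointers = list(self.blocks)   # live view: [1, 2, 3, ...]
        self.snapshots = {}

    def snapshot(self, name):
        # Only the pointer list is copied, never the data blocks.
        self.snapshots[name] = list(self.pointers)

    def write(self, position, data, new_block):
        # New data lands in a free block and the live pointer is updated.
        # Any snapshot keeps pointing at the old, untouched block.
        self.blocks[new_block] = data
        self.pointers[position - 1] = new_block

    def read(self, pointers):
        return [self.blocks[p] for p in pointers]


vol = Volume(["A", "B", "C", "D", "E", "F"])
vol.snapshot("snap1")
vol.write(position=4, data="D2", new_block=11)

print(vol.pointers)                      # [1, 2, 3, 11, 5, 6]
print(vol.snapshots["snap1"])            # [1, 2, 3, 4, 5, 6]
print(vol.read(vol.snapshots["snap1"]))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

The live view now reads 1,2,3,11,5,6 while the snapshot still resolves to the original six blocks.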
VMware Snapshots
VMware snapshots are much different. When you have a virtual disk there is an associated .vmdk file. When you take a snapshot, a new .vmdk is created to hold any changes. This DISKNAME-000001.vmdk will then store all of the deltas. VMware describes this as Copy on Write: any changes that would have gone to the original DISKNAME.vmdk are written to DISKNAME-000001.vmdk instead.
VMware snapshots should be used carefully because they can quickly fill up the datastore they belong to. For instance, a 40 GB disk could potentially have a 40 GB snapshot file. There could also be a second 40 GB snapshot on top of that one, so with 2 snapshots we could have tripled the amount of space used.
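The worst-case arithmetic can be written out as a tiny Python helper. The function name is made up, and it assumes every delta grows to the full size of the base disk while ignoring smaller overheads such as metadata or memory state files.

```python
def worst_case_usage_gb(disk_gb, snapshot_count):
    # Assumption: each snapshot delta can grow as large as the base disk.
    return disk_gb * (1 + snapshot_count)


print(worst_case_usage_gb(40, 2))  # 120, i.e. triple the original 40 GB
```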
Also, when you have finished using a snapshot and “delete” the snapshot, you actually are committing those deltas to the original vmdk. This will take resources and could affect your VM’s performance. The larger the snapshot, the longer it will take to commit and the higher the likelihood that you’ll notice a performance issue.
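To make the delta and commit behavior more concrete, here is a minimal Python sketch. The SnapshottedDisk class and its methods are hypothetical and only model the idea with dictionaries; they are not how ESXi actually implements its sparse disks.

```python
class SnapshottedDisk:
    def __init__(self, base):
        self.base = dict(base)  # stands in for the original .vmdk
        self.delta = None       # stands in for the snapshot delta .vmdk

    def take_snapshot(self):
        # From here on the base is left alone and changes go to the delta.
        self.delta = {}

    def write(self, block, data):
        target = self.base if self.delta is None else self.delta
        target[block] = data

    def read(self, block):
        # Changed blocks come from the delta, everything else from the base.
        if self.delta is not None and block in self.delta:
            return self.delta[block]
        return self.base[block]

    def delete_snapshot(self):
        # "Deleting" the snapshot commits every delta block into the base.
        # The work grows with the size of the delta.
        if self.delta is None:
            return
        self.base.update(self.delta)
        self.delta = None


disk = SnapshottedDisk({1: "a", 2: "b"})
disk.take_snapshot()
disk.write(2, "B")
print(disk.base[2], disk.read(2))  # b B
disk.delete_snapshot()
print(disk.base[2])                # B
```

The commit loop in delete_snapshot is the part that costs resources: the more blocks have piled up in the delta, the more there is to write back.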
VMware snapshots work best for short term operations such as applying a new patch. If you finish installing your patch and are satisfied with the results, get rid of your snapshot. Don’t leave them sitting around taking up space and causing you a performance issue later on.
SNAPSHOTS ARE NOT BACKUPS!
My final point is this: SNAPSHOTS ARE NOT BACKUPS. If you lose the disk array that the snapshots are on, you’ve lost your data too. Make sure your backups are stored in another location.
Thank you for your amazing blog. Really worth reading. One question on this post.
In a storage snapshot, when you take the first snapshot, blocks 1 to 12 are copied. When you take the second snapshot, the data in block 4 is changed, and that changed data is copied to block 11. So the second snapshot includes blocks 1 to 12, but block 4 has a pointer to block 11. What happens when you restore this snapshot or take a backup of it?
First, I want to mention that blocks 1-12 are not copied. The file system just points to where the blocks are located.
Second, block 4 doesn’t point to block 11. The data in block 4 remains the same; the snapshot will contain the list of pointers to where the current data resides. So instead of the file system pointing at block 4, it points to block 11.
If you restore the second snapshot you would see the data in blocks 1,2,3,11,5,6.
I hope this answers your question.
Thanks for reading!
I’m confused. What you describe with VMware snapshots sounds like Redirect-on-Write.. not COW. If the base vmdk is locked and all new data is written to the snapshot delta vmdk wouldn’t that be a Redirect-on-Write?
VMware uses the Copy-On-Write method for their snapshots. I see your point but have found this over and over again in the VMware documentation. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1015180
and
http://pubs.vmware.com/vsphere-51/index.jsp?topic=%2Fcom.vmware.vsphere.vm_admin.doc%2FGUID-38F4D574-ADE7-4B80-AEAB-7EC502A379F4.html
Why does deleting a VMware snapshot take so much time?
If VMware uses Copy-on-Write, wouldn’t a commit from the delta to the base be unnecessary?
I’m super confused.
All of the changes to the original vmdk file are stored in the delta disk. Once a snapshot is committed, the changes have to be written to the original disks which can take some time.
So it’s not a COW process, right? thanks for your reply.
It specifically says in the KB article that the snapshot disk employs a copy on write. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1015180
It might seem a bit counter-intuitive but the metadata is copied to the sparse disk and then updated.
Check out this blog post from Nimble Storage http://www.nimblestorage.com/blog/technology/o-snapshot-what-art-thou/ which goes into more detail.
VMware’s use of the term “Copy on Write” is the generic meaning according to standard computer science terminology. There are two ways to do a copy (of memory, disk, etc., whatever).
1. Full Copy – all data is copied at once.
2. Copy on Write – You make a “virtual copy” by copying only the metadata, and present two copies to whatever is using the implementation. When a client/user tries to write to the copy, you have to write that updated data somewhere and make sure that the metadata points to the right place to keep the abstraction alive.
When implementing CoW, you can choose to update the “base” or update the “delta” in your implementation. Most SAN implementations write updated data to the “base” (original), but intercept those writes and first copy the old data that was in the base to the delta file. In this way, writing to the base slows down, but the delta file is continually updated to contain the data of the snapshot as of the time it was taken. This means that deleting the snapshot is instantaneous. Reverting to a snapshot means copying all of the data in the delta file over the original file, and incurs a performance penalty due to all of the copying. This is great for backup use, since you usually want to delete the snapshot after the backup is done anyway.
VMware’s implementation prioritizes ongoing write performance inside the VM and rollback performance instead. The “base” is frozen at the time the snapshot is made and becomes read-only. When any changes are made, they are made to the delta file directly. Reverting to a snapshot is as easy as deleting the delta file and treating the original disk file as read-write again, so it is relatively instant. This is great when you want to use a VM for installing something for test purposes, and then revert it to the previous state quickly. On the other hand, if you want to delete a snapshot (which really means committing the changes), then the system overwrites the base data using the data from the snapshot delta file and then deletes the delta file. Since the delta file holds new writes, there is no need to intercept writes and back up the original data as there is in the typical SAN implementation, so writes will be faster.
So the difference is what gets updated when writes occur after the snapshot is taken. Either way is “copy on write” in the sense that the data is not copied wholesale when a snapshot is taken, but the snapshot diverges from the original only when updated.
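A minimal Python sketch of the two approaches described above, using plain dictionaries. The class names PreserveOldInDelta and WriteNewToDelta are invented for illustration and are not taken from any vendor's implementation.

```python
class PreserveOldInDelta:
    """Writes go to the base; the old block is copied to the delta first."""

    def __init__(self, base):
        self.base = dict(base)
        self.delta = {}

    def write(self, block, data):
        if block not in self.delta:  # preserve the original only once
            self.delta[block] = self.base[block]
        self.base[block] = data

    def live(self):
        return dict(self.base)

    def snapshot_view(self):
        # Snapshot = current base, overlaid with the preserved old blocks.
        return {**self.base, **self.delta}


class WriteNewToDelta:
    """The base is frozen; new writes go straight to the delta."""

    def __init__(self, base):
        self.base = dict(base)
        self.delta = {}

    def write(self, block, data):
        self.delta[block] = data

    def live(self):
        # Live data = frozen base, overlaid with the new blocks.
        return {**self.base, **self.delta}

    def snapshot_view(self):
        return dict(self.base)


for cls in (PreserveOldInDelta, WriteNewToDelta):
    disk = cls({4: "old", 5: "same"})
    disk.write(4, "new")
    print(cls.__name__, disk.live()[4], disk.snapshot_view()[4])
```

Both variants show "new" in the live view and "old" in the snapshot view. They differ only in which file absorbs the write, and therefore in whether deleting or reverting is the expensive operation.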
@Noa Shiruba
So how is VMware’s CoW implementation any different to Redirect-On-Write?