Friday, August 19, 2016

PernixData Backup Best Practices - 3. VADP Hot-Add

The third backup method in this series is vSphere Storage APIs for Data Protection (VADP). The Virtual Disk Development Kit (VDDK) is used with VADP to develop backup and restore software. VDDK is an open API and SDK and includes the following components:
  • The virtual disk library, a set of C function calls for manipulating VMDK files
  • The disk mount library, a set of C function calls for remotely mounting VMDK files
  • Sample C++ code
  • VDDK utilities which include disk mount and the virtual disk manager
  • Documentation
The VADP framework was introduced in vSphere 4.0 to enable backup products to perform centralized, efficient, off-host, LAN-free backups of vSphere virtual machines (VMs). VADP takes advantage of the snapshot capabilities of VMware VMFS or NFS, so the VM itself needs no downtime. VADP-based backups also offer features such as compression between the backup host and the VADP proxy, and differential and incremental backups. More importantly, the I/O performance of the VM is not unduly affected because all backup I/O flows through the VADP proxy. The typical flow of a VADP backup using Hot-Add is shown in the following figure.
Figure 1: VADP Hot-Add Functionality
  1. The proxy triggers the backup (the VDDK API initiates a session with the ESXi host or vCenter containing the target VM).
  2. The guest creates a snapshot.
  3. This creates a delta disk.
  4. The Hot-Add process creates another redo log of the previously created delta disk.
  5. The previously created snapshot is hot-added as an independent non-persistent disk.
  6. The proxy starts the backup and optionally sends the data to another backup server.
  7. When the backup is finished, the proxy removes the delta disk and triggers snapshot removal.
  8. The guest removes the snapshot.
  9. The delta disk is consolidated into the base disk.
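The nine steps above can be sketched as a toy state model. This is only an illustration, not VDDK or vSphere API code: the `ToyVM` and `ToyProxy` classes, the file names, and the `hot_add_backup` method are all hypothetical. They only track which disks exist and what the proxy has attached at each step. The assumption that the proxy copies the disks below the delta is also ours.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM: just a name and a disk chain."""
    name: str
    disk_chain: list = field(default_factory=lambda: ["base.vmdk"])


@dataclass
class ToyProxy:
    """Hypothetical stand-in for the backup proxy VM."""
    attached: list = field(default_factory=list)  # disks hot-added to the proxy
    events: list = field(default_factory=list)    # ordered record of the steps

    def hot_add_backup(self, vm):
        self.events.append("1 session opened")
        # Steps 2-3: the snapshot adds a delta disk that absorbs new guest writes.
        vm.disk_chain.append("delta.vmdk")
        self.events.append("2-3 snapshot and delta disk created")
        # Step 4: Hot-Add creates its own redo log (file name is made up).
        redo = "hotadd-redo.vmdk"
        self.events.append("4 redo log created")
        # Step 5: the snapshot is attached as an independent non-persistent disk.
        self.attached.append((redo, "independent-nonpersistent"))
        self.events.append("5 snapshot hot-added")
        # Step 6: the proxy reads the disks (assumption: those below the delta).
        copied = vm.disk_chain[:-1]
        self.events.append("6 data backed up")
        # Step 7: the proxy removes what it attached and triggers snapshot removal.
        self.attached.clear()
        self.events.append("7 proxy cleanup, snapshot removal triggered")
        # Steps 8-9: the snapshot is removed and the delta is consolidated.
        vm.disk_chain.remove("delta.vmdk")
        self.events.append("8-9 snapshot removed and consolidated")
        return copied


vm = ToyVM("app01")
proxy = ToyProxy()
backed_up = proxy.hot_add_backup(vm)
```

After the run, the VM is back to a single base disk and nothing remains attached to the proxy, which mirrors the clean state the workflow is meant to leave behind.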
With PernixData FVP present, the process works slightly differently, although FVP fully supports VADP-based backups. It is essential that the VADP proxy runs on a host where the FVP Host Extension has been installed and that the proxy VM is tagged as a VADP proxy in FVP under: Configuration → Advanced → VADP → add the VADP proxy.

If the VADP proxy is not tagged as such, the backup data may be inconsistent and unusable!


The following figure illustrates how FVP behaves when the VADP proxy and the VM to be backed up are on the same host.
Figure 2: VADP Hot-Add, Guest VM & VMware Proxy are on the same Host
  1. When a snapshot of the source VM is triggered, FVP automatically initiates a transition from write-back (WB) to write-through (WT).
  2. FVP intercepts I/O but is not aware of its type (it sees only an LBA and an offset). To ensure read-consistent data and to avoid caching the proxy VM's data (cache pollution), the VM is tagged with the VADP policy.
  3. Read I/O triggers the bypass implementation within FVP.
  4. Because of the context in which the I/O arrives, valid data is always read. If the data being read has not yet been destaged to the storage system, FVP consults the staging area and waits for the data to be destaged before proceeding.
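The same-host read path described above can be illustrated with a small model. This is a sketch under stated assumptions, not FVP code: the `ToyFvpHost` class, its dictionaries, and the `vadp_proxy` flag are hypothetical, and "waiting for destaging" is simulated by destaging the block synchronously.

```python
class ToyFvpHost:
    """Hypothetical model of one FVP-accelerated host; not FVP code."""

    def __init__(self):
        self.storage = {}   # LBA -> data already on the storage system
        self.staging = {}   # LBA -> data written in WB mode, not yet destaged
        self.cache = {}     # LBA -> cached read data
        self.policy = "WB"

    def on_snapshot(self):
        # Step 1: a snapshot triggers the WB -> WT transition.
        self.policy = "WT"

    def write(self, lba, data):
        self.cache.pop(lba, None)         # drop any stale cached copy
        if self.policy == "WB":
            self.staging[lba] = data      # acknowledged before destaging
        else:
            self.staging.pop(lba, None)   # a newer write supersedes staged data
            self.storage[lba] = data      # WT: written straight to storage

    def read(self, lba, vadp_proxy=False):
        if vadp_proxy:
            # Steps 3-4: bypass the cache; if the block is still staged,
            # destage it first (stand-in for "wait until destaged").
            if lba in self.staging:
                self.storage[lba] = self.staging.pop(lba)
            return self.storage.get(lba)
        # Regular VM read: newest data first, then cache, then storage.
        if lba in self.staging:
            return self.staging[lba]
        if lba not in self.cache:
            self.cache[lba] = self.storage.get(lba)
        return self.cache[lba]


host = ToyFvpHost()
host.write(7, b"new")            # lands in the staging area under WB
host.on_snapshot()               # backup starts: WB -> WT
data = host.read(7, vadp_proxy=True)
```

In this model the proxy read returns the latest data and leaves the cache untouched, which is the behavior the VADP policy is meant to guarantee.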
The next figure illustrates how FVP works when the VADP proxy and the VM to be backed up are on different hosts. The difference here is that FVP running on a different host interprets certain file locks differently. A file lock is always held by a host, not by the VM itself. When the VADP proxy and the VM are on different hosts, FVP uses a Safe Access to trigger an RPC request to the host where the VM is powered on. Once FVP receives the notification that all data from the FVP layer has been destaged, the redo file can be mounted to the VADP proxy.
Figure 3: VADP Hot-Add, Guest VM & VMware Proxy are on separate Hosts
  1. FVP intercepts I/O but does not classify where it belongs (it sees only an LBA and an offset). To ensure read-consistent data and to avoid caching the proxy VM's data (cache pollution), the VM is tagged with the VADP policy.
  2. Because data is now accessed from a remote host, the lock on the redo file must be obtained. Because the disk lock information can be read, the host that owns the file can be identified.
  3. A Remote Procedure Call (RPC) is triggered via a Safe Access to the peer host running the VM. If there is data to destage, FVP waits until destaging has finished.
  4. The redo file is created and mounted to the VADP proxy.
  5. Backup through the VADP proxy proceeds.
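The cross-host ordering above (find the lock owner, ask it to destage, then mount) can be sketched as follows. This is an illustrative model only: `ToyPeerHost`, `destage_rpc`, `mount_redo_for_proxy`, and the lock table are hypothetical names, and the RPC is reduced to a direct method call.

```python
class ToyPeerHost:
    """Hypothetical host running the VM being backed up; not FVP code."""

    def __init__(self, name):
        self.name = name
        self.staged_blocks = {}  # VM name -> number of blocks not yet destaged

    def destage_rpc(self, vm_name):
        # Stand-in for the RPC handler: flush everything, then acknowledge.
        self.staged_blocks[vm_name] = 0
        return True


def mount_redo_for_proxy(vm_name, lock_owners, hosts, proxy_disks):
    # Step 2: read the disk lock information to find the owning host.
    owner = hosts[lock_owners[vm_name]]
    # Step 3: ask the peer host to destage and wait for its acknowledgement.
    if not owner.destage_rpc(vm_name):
        raise RuntimeError("peer host did not confirm destaging")
    # Step 4: only now is the redo file created and mounted to the proxy.
    redo = vm_name + "-redo.vmdk"
    proxy_disks.append(redo)
    return owner.name


esx2 = ToyPeerHost("esx2")
esx2.staged_blocks["app01"] = 12   # pending writes on the peer host
proxy_disks = []
owner = mount_redo_for_proxy("app01", {"app01": "esx2"}, {"esx2": esx2}, proxy_disks)
```

The point of the sketch is the ordering: the redo file reaches the proxy only after the peer host has confirmed that nothing is left to destage.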
