Datacenters have become smaller than they were a few years ago because of the virtualization revolution. Organizations are deploying their services on virtual servers instead of physical servers, and most companies around the world have large virtual farms. But some traditional challenges still exist, such as backup and disaster recovery.
Virtual machines are easier to deploy and maintain than physical servers, but administrators still have to back up machines, files and any other important data, the same as before. Nothing has changed! Organizations also still need a recovery site in another geographical location for disaster recovery, and all machines should be replicated between the primary and recovery sites. Backup solutions for virtual infrastructures differ slightly from those for physical infrastructures, but they remain backward compatible, which means that current backup solutions cover both virtual and physical machines.
The new backup and recovery solutions still need storage space for backup files, bandwidth for transfers and computing resources for processing backup jobs. Every backup solution has its best practices, and administrators should follow them to achieve the best results. Knowing VM backup best practices is important for planning and implementing backup solutions.
Join me to review some of the best practices for backup solutions in virtual infrastructures.
Define Backup Strategy
Defining a good strategy is the first and most important step of any backup and recovery solution. Administrators should answer these questions:
- Should all virtual machines be backed up?
- No. Not all virtual machines are critical. Virtual machines should be divided into groups, with backup plans defined according to those groups and their priorities.
- On-site or off-site?
- RTO and RPO: Organizations should define their recovery time objective (RTO) and recovery point objective (RPO) as part of their business continuity planning.
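Dividing virtual machines into groups and attaching a plan to each group can be sketched in a few lines of Python. This is only an illustration: the plan names, the RPO and RTO values and the VM names are all hypothetical, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPlan:
    rpo_hours: float     # maximum tolerated data loss (assumed value)
    rto_hours: float     # maximum tolerated downtime (assumed value)
    offsite_copy: bool   # whether a copy is also kept off-site


# Hypothetical plans; None means the group is not backed up at all.
PLANS = {
    "critical": BackupPlan(rpo_hours=1, rto_hours=2, offsite_copy=True),
    "standard": BackupPlan(rpo_hours=24, rto_hours=8, offsite_copy=False),
    "excluded": None,
}


def group_vms(vm_priority):
    """Group VM names by priority so each group can get its own plan."""
    groups = {priority: [] for priority in PLANS}
    for vm_name, priority in sorted(vm_priority.items()):
        groups[priority].append(vm_name)
    return groups


# Hypothetical inventory: VM name -> priority chosen by the business.
inventory = {
    "sql01": "critical",
    "dc01": "critical",
    "web01": "standard",
    "test01": "excluded",
}

for priority, vms in group_vms(inventory).items():
    plan = PLANS[priority]
    print(priority, vms, "no backup" if plan is None else plan)
```

Keeping the plans in one table makes it easy to review with the business which machines get which RPO and RTO.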
Choose the Best Backup Method
The backup method determines the resource usage of the backup solution, so choosing the best backup method is very important. There are two main backup methods:
- Full Backup
- Incremental Backup
A full backup means backing up everything. When full backup is chosen as the backup method, all virtual machine files are copied to another location, including the virtual machine configuration files, logs, hard disk files and any other files.
A full backup is an exact copy of the original VM, including its configuration files, and unlike a snapshot it does not depend on anything else. It can therefore be restored as a separate machine when the virtual machine is corrupted or when the administrator wants to return the machine to a specific state.
This method is suitable for virtual machines with a high change rate.
Full backup has some disadvantages that administrators should consider:
- Even when backup jobs are configured with maximum compression, a full backup is always a large file and needs more storage space.
- Because of its large size, a full backup takes more time to process than other methods.
- Transferring full backup files needs more time and more bandwidth because of their size.
An incremental backup depends on a full backup. This means that every incremental backup job consists of at least one full backup and one or more incremental backups. The first piece of any incremental backup is always a full backup, and the other pieces are files containing the changes made to the virtual machine since the previous backup in the chain.
Incremental backup has some advantages:
- Taking an incremental backup is usually faster than taking a full backup.
- An incremental backup is much smaller than a full backup.
- Because of the smaller size, administrators can configure backup jobs with more restore points, for example every day, every hour or even every few minutes.
It also has some disadvantages:
- If one piece of the incremental chain is deleted or corrupted, that state of the machine is lost, and the administrator can only recover the machine to a state before the missing piece.
- Incremental backup is not suitable for virtual machines with high change rates. Some virtual machines read and write to disk constantly, so processing the changes takes more time and can need as much storage space as a full backup.
- Restoring an incremental backup takes more time than restoring a full backup, because all changes must be combined with the initial full backup before the machine can be restored.
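The dependency between the pieces of an incremental backup can be made concrete with a small Python sketch. It assumes a simple forward chain in which each incremental piece depends on every piece before it; real backup products may organize their chains differently, and the piece names are hypothetical.

```python
def pieces_needed(chain, target):
    """Return every piece that must be combined to restore `target`."""
    index = chain.index(target)
    return chain[: index + 1]


def recoverable_points(chain, damaged):
    """Return the restore points still usable when some pieces are damaged."""
    usable = []
    for piece in chain:
        if piece in damaged:
            break  # every later restore point depends on this piece
        usable.append(piece)
    return usable


# Hypothetical chain: one full backup followed by three incrementals.
chain = ["full", "inc1", "inc2", "inc3"]

print(pieces_needed(chain, "inc2"))
print(recoverable_points(chain, damaged={"inc2"}))
```

The first call shows why restores are slower (three pieces are read to restore `inc2`), and the second shows that losing `inc2` leaves only the states before it.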
How Many Restore Points?
The number of restore points depends on the organization's policies and the importance of the data. It has a high impact on backup system resource usage, the backup window and even the virtual infrastructure.
Some virtual machines do not need to be backed up every hour or every day, but others host critical services and important data, so they need more restore points to always have a backup of the latest state. For those virtual machines, replication is a much better solution.
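To get a feeling for how the number of restore points drives storage usage, here is a rough Python estimate. It is a simplification under stated assumptions: no compression or deduplication, a constant change rate between backups, and hypothetical sizes.

```python
def full_only_gb(vm_size_gb, restore_points):
    """Storage needed when every restore point is a full backup."""
    return vm_size_gb * restore_points


def incremental_gb(vm_size_gb, change_percent, restore_points):
    """Storage needed for one full backup plus incremental restore points."""
    increment_gb = vm_size_gb * change_percent / 100
    return vm_size_gb + increment_gb * (restore_points - 1)


# Hypothetical 100 GB virtual machine with 7 restore points.
print(full_only_gb(100, 7))          # seven full copies
print(incremental_gb(100, 5, 7))     # low change rate
print(incremental_gb(100, 100, 7))   # very high change rate
```

With a 100% change rate the incremental estimate equals the full-only estimate, which is why incremental backup brings little benefit for machines that rewrite most of their disk between backups.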
Compression, Deduplication, Yes or No?
Any organization should provision a separate storage system for backup files, with a capacity of at least 50% to 75% of the main storage space.
Compression and data deduplication are therefore good ways to save storage space and bandwidth.
Saving bandwidth allows more parallel jobs to be scheduled, and saving space allows more backups to be kept on the same storage.
In addition to the compression and deduplication of the backup software, administrators can use storage hardware deduplication to make backup files smaller, and the Windows Deduplication feature is also very useful for reducing backup size on disk.
Just think about a thousand machines with the same guest OS: deduplication stores only the unique blocks, and the more duplicated blocks there are, the more it reduces the backup file size.
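The idea of storing only unique blocks can be illustrated with a minimal Python sketch. It assumes fixed-size blocks identified by their SHA-256 hash, which is one common approach but not necessarily how any particular backup product or storage array works; the disk contents are made up.

```python
import hashlib


def dedup_stats(disks, block_size=4096):
    """Count total blocks and unique blocks across several disk images."""
    unique_hashes = set()
    total_blocks = 0
    for data in disks:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            unique_hashes.add(hashlib.sha256(block).digest())
            total_blocks += 1
    return total_blocks, len(unique_hashes)


# Hypothetical guest OS made of four distinct 4 KiB blocks.
guest_os = b"".join(bytes([i]) * 4096 for i in range(4))

# Two VMs share the same guest OS and differ in one data block each.
vm1 = guest_os + b"a" * 4096
vm2 = guest_os + b"b" * 4096

total, unique = dedup_stats([vm1, vm2])
print(f"{total} blocks read, {unique} unique blocks stored")
```

Every additional VM with the same guest OS adds blocks to read but almost nothing to store, which is the effect described above.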
Compression and deduplication have a low impact on the computing and disk resources of both the virtual and the backup infrastructure, provided that administrators choose a normal compression level.
A higher compression level has more impact on backup and restore performance and also carries a higher risk of corruption.
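The trade-off between compression level and processing cost can be observed with Python's standard `zlib` module. This is only a sketch: backup products use their own compression algorithms, and the sample data here is artificial and far more repetitive than a real disk.

```python
import time
import zlib


def compressed_sizes(data, levels=(1, 6, 9)):
    """Compress `data` at several zlib levels and return size and time per level."""
    results = {}
    for level in levels:
        started = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - started
        # Make sure the data survives a round trip before trusting the size.
        if zlib.decompress(packed) != data:
            raise ValueError(f"round trip failed at level {level}")
        results[level] = (len(packed), elapsed)
    return results


# Artificial, highly repetitive sample data.
sample = b"guest OS block " * 50_000

for level, (size, seconds) in compressed_sizes(sample).items():
    print(f"level {level}: {size} bytes in {seconds:.4f} s")
```

Running it against a sample of your own data gives a quick impression of how much extra time a higher level costs for how much extra saving.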
Choose the Best Transfer Method
There are several ways to transfer backup data from the virtual infrastructure to the backup repository.
Transferring over the network is the most popular method, and most administrators use it because it needs no additional configuration. The backup system only needs access to the virtual infrastructure network. A link of at least 10 Gb is recommended because transferring backups needs high bandwidth.
This method has an impact on the performance of the virtual infrastructure network, and any virtual machine that is sensitive to network latency may be affected.
Another method is transferring backup data via the SAN fabric. The backup system must have access to the SAN storage, the same as the virtual infrastructure. This method has a very low impact on the virtual infrastructure and is much faster than the other methods.
All LUNs must also be presented to the backup servers, which carries a risk: the backup server OS has access to every LUN, so administrators must make sure the LUNs are never reinitialized from it.
Backup solutions are more flexible in virtual infrastructures because virtual infrastructures are more flexible than physical ones. A backup server can also act as a virtual proxy that processes and transfers backup data by using the virtual infrastructure API.
Because virtual proxies are virtual machines, this method has more impact on the performance of the virtual infrastructure: the proxies consume virtual infrastructure resources. Some prerequisites must also be met before virtual proxies can be deployed.
Virtual proxies are faster than network transfer, and the backup system can run more parallel jobs through them.
There may be other transfer methods, depending on the backup solution. The methods above are the most popular ones in virtual backup solutions.
File Level Backup, Object Level Backup or Not?!
Sometimes there is no need to back up the guest OS files, because only other files are important. In that case, backup jobs can be configured to transfer files from within the virtual machine to the backup repository.
This helps organizations keep their backup files smaller because only important data is transferred, and the backup files can be restored to a different location or a different virtual machine.
A good example is backing up AD objects: administrators are even able to restore a single object out of many.
Restoring a file or an object is much faster than restoring all the files of a virtual machine, and it needs no downtime.
Of course, there are some conditions, and most native backup solutions can’t do it!
Backup Time Window
A time window defines a time interval in which something can occur. In backup solutions, the time window is a very important element. Why?
The answer is easy: processing data and transferring it to the backup repository has an impact on the performance of the virtual infrastructure.
So administrators should configure the time window of backup jobs carefully and intelligently.
Backup jobs should be finished by the end of the time window. Choosing the best backup method, using compression and data deduplication, and choosing the best transfer method all help to achieve that.
If the time window is defined so that jobs run during working hours, administrators should expect storage latency and slower computing resources. The best time window is one that runs backup jobs after working hours or outside peak time.
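Whether a job can finish inside its time window can be estimated with simple arithmetic. The Python sketch below assumes decimal units (1 GB = 8 Gb), a constant link speed and a hypothetical efficiency factor for protocol overhead; it ignores processing time, compression and deduplication.

```python
def window_hours(start_hour, end_hour):
    """Length of a daily window in hours; the window may cross midnight."""
    return (end_hour - start_hour) % 24


def transfer_hours(data_gb, link_gbps, efficiency=0.7):
    """Estimated hours to move `data_gb` over a link of `link_gbps`.

    `efficiency` is an assumed fraction of the link that is actually usable.
    """
    seconds = data_gb * 8 / (link_gbps * efficiency)
    return seconds / 3600


def fits_window(data_gb, link_gbps, start_hour, end_hour, efficiency=0.7):
    """True if the estimated transfer time fits inside the window."""
    needed = transfer_hours(data_gb, link_gbps, efficiency)
    return needed <= window_hours(start_hour, end_hour)


# Hypothetical job: 9,000 GB over a 10 Gb link, window from 22:00 to 06:00.
print(window_hours(22, 6))
print(round(transfer_hours(9000, 10), 2))
print(fits_window(9000, 10, 22, 6))
```

If the estimate does not fit, the options are the ones already discussed: a lighter backup method, compression and deduplication, or a faster transfer method.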
Native Solutions or Third-Party
Most virtualization platforms have native backup solutions, but there are also many third-party solutions for backing up virtual machines.
Native solutions are mostly cheaper than third-party solutions, but third-party solutions deliver more features.
Third-party solutions can manage multiple virtualization platforms, such as VMware vSphere and Microsoft Hyper-V.
Native solutions have fewer features: they often lack file-level backup, object-level backup, backup copy, a choice of backup methods and others.
Your business is your data, and keeping data safe means keeping your business safe. Using third-party solutions will help you to have better data protection, but choosing native or third-party depends on your organization and company strategies.
As you have read in this post, VM backup best practices depend on many elements, including your budget.
Backup has a cost, sometimes even higher than that of the main infrastructure, but companies have nothing without their data, and protecting data is as important as generating new data.
Generating new data is also much more expensive than protecting your current data.
Define the best strategy for your backup infrastructure and keep optimizing your backup system.