Let’s Talk Backup & Restore – A Case Study | 2Secure Corp

In this episode of The Cybersecurity Insider, join host Yigal Behar as he examines real-world scenarios and backup and restore strategies.

He shares case studies that highlight the importance of Disaster Recovery Planning (DRP) and Business Continuity Planning (BCP) so that you can learn the key lessons and strategies to help you safeguard your organization from similar disasters.

Disaster Recovery & Business Continuity Planning

Yigal opens by stressing the importance of disaster recovery and business continuity planning. “We absolutely need to have a plan in place, a solid strategy, because if we jump in without thinking things through, we’re in for a world of trouble,” he warns. He adds that this requires serious critical thinking to anticipate all the possible scenarios and their consequences.

To illustrate his point, Yigal then shares a real-life case study from a few months ago. He shows us a picture sent to him by an IT manager, along with the finding: the client had lost all of its virtual machines. “The business is down with no ability to do anything.” Yigal pauses, letting the gravity of the situation sink in.

Examining a Client’s Vulnerable IT Infrastructure

Yigal then discusses the client’s IT setup, describing their “client footprint.” They have two Microsoft Hyper-V hosts. Yigal shares his perspective, explaining that they’re phasing out Microsoft in favor of VMware. “My personal opinion…I’m not such a fan of Microsoft,” he admits. “I think that the virtualization, Hyper-V, doesn’t have all the features that VMware provides.”

He points to features like vMotion, which allows for seamless migration of virtual machines between hosts for maintenance or recovery purposes. Yigal highlights that this client lacks that capability.

Furthermore, Yigal notes that the client’s Windows versions are outdated and in need of upgrades, as Microsoft has ended support for them. The client has about 12 virtual machines running various critical applications like SQL Server, Access databases, domain controllers, and QuickBooks.

Yigal emphasizes that the client’s operations extend beyond their on-premises setup, as they use Salesforce for order processing and product management. “It’s not only their on-prem, but also they have cloud applications,” he notes.

A Costly Cloud Mishap & a Single Point of Failure

Interestingly, Yigal reveals that this client had initially planned to migrate everything to the Azure cloud but ultimately decided against it due to the exorbitant cost. “Thousands of dollars per month just to have a few virtual machines, a SQL database, and domain controllers was really a lot in terms of expense,” he remarks.

Yigal then brings our attention to a critical vulnerability in their infrastructure: a single storage device. He shows us a picture of the device connected to the hosts, emphasizing the lack of redundancy. “This is a big problem because if you need to recover, you can’t recover anywhere,” Yigal warns. “If you have a problem with that hardware…you’re done, you’re toast.”

To make matters worse, their local backup relies on USB storage. Yigal shakes his head, “You can’t use USB as a backup…you can’t restore the entire virtual machines if you need to restore terabytes of virtual machines. It’s going to take you weeks and weeks and weeks until you are able to have the data.”

Yigal then moves on to discuss the impracticality of relying solely on USB storage for restoration. “Let’s say you have the storage device and you need to restore it from USB,” he begins, “I don’t need to confuse you speaking with this.” He acknowledges that the client did have cloud backups for all their virtual machines, which ultimately saved them, and stresses that backups should never live in just one place. “It’s better to have another place where you backup your information, not only on-premises,” he advises.

What Do We Need to Consider?

Transitioning to the specifics of the case, Yigal emphasizes the importance of understanding the client’s critical applications and their locations, whether on-premises or in the cloud. He describes a scenario where a cloud-based order system becomes non-functional because on-premises systems are down, underscoring the interconnectedness of operations.
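
One way to make this interconnectedness concrete is to write down what each application depends on and then walk that map when something fails. The sketch below is illustrative only: the application names and the dependency map are hypothetical and are not taken from the client's actual environment.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration):
# each application lists the components it needs in order to work.
DEPENDS_ON = {
    "cloud_order_system": ["sql_server"],
    "quickbooks": ["domain_controller"],
    "sql_server": ["storage_device"],
    "domain_controller": ["storage_device"],
    "email": [],
}


def impacted_by(failed, depends_on):
    """Return every application that stops working, directly or
    indirectly, when the component `failed` goes down."""
    # Invert the map: component -> applications that depend on it.
    dependents = {}
    for app, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(app)

    # Breadth-first walk outward from the failed component.
    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for app in dependents.get(current, ()):
            if app not in impacted:
                impacted.add(app)
                queue.append(app)
    return impacted


print(sorted(impacted_by("storage_device", DEPENDS_ON)))
```

In this made-up map, losing the single storage device takes down every on-premises service and, through them, the cloud-based order system, while email is unaffected.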

Yigal explains that various factors can cause system downtime, including hardware failures, ransomware attacks, or other cyber threats. He insists on considering downtime tolerance when devising a backup and recovery plan.

He recounts a recent conversation with a client who was uncertain about their backup procedures. Upon investigation, it turned out they were only backing up to the virtual machine itself, without capturing critical data or images.

How Much Data Can We Lose? 

Yigal cautions against having no backups at all, emphasizing that both on-premises and cloud backups are essential. He explains that ransomware can encrypt not only data but also backups, making it necessary to have multiple layers of protection.

Yigal then raises the question of how much data loss is acceptable. He acknowledges that this varies depending on the business and its criticality. He mentions that some clients require hourly backups due to the sensitive nature of their data. Yigal leaves the viewers with a thought-provoking question: “How long can we lose the data?”
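
This question can be turned into a simple check. In the worst case, a failure strikes just before the next backup runs, so the data lost equals the full backup interval. The following is a minimal sketch under that assumption; the function names and the example figures are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, failure_time):
    """Time between the last good backup and the failure.
    Everything created or changed in this window is lost."""
    if failure_time < last_backup:
        raise ValueError("failure_time must not precede last_backup")
    return failure_time - last_backup


def meets_tolerance(backup_interval, max_acceptable_loss):
    """Worst-case loss equals the backup interval, so the schedule is
    acceptable only if the interval fits within the tolerance."""
    return backup_interval <= max_acceptable_loss


# Hypothetical example: nightly backups, business tolerates 4 hours of loss.
print(meets_tolerance(timedelta(days=1), timedelta(hours=4)))
# Hourly backups against the same tolerance.
print(meets_tolerance(timedelta(hours=1), timedelta(hours=4)))
# A failure at 17:30 after a 02:00 backup.
print(data_loss_window(datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 17, 30)))
```

The acceptable-loss figure is a business decision rather than a technical one, which is why it has to be agreed on before the backup schedule is chosen.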

Thorough Planning & Redundancy 

Yigal circles back to his earlier point about the importance of thorough planning. He recalls the initial conversation about cloud backups for this client, where the focus was primarily on having a cloud backup in addition to the on-premises one. However, he admits, “When we did the cloud backup, we didn’t think about the restore…how long can it take. We didn’t do all this thinking. Nobody did any thinking.”

He takes responsibility for this oversight, expressing regret that, as consultants, his team didn’t ask all the necessary questions upfront. “This is very, very bad from our perspective,” Yigal candidly admits. His goal is to help others avoid similar mistakes, which is why he’s sharing this case study.

Yigal then shifts the focus to the requirements for a successful backup and recovery plan. He urges the audience to consider the time it would take to restore files versus entire virtual machines, which could involve gigabytes or terabytes of data. He emphasizes that recovery time is paramount, whether restoring from local or cloud backups.

Yigal passionately warns against using USB drives or backup tapes for storing critical data. “I hope that you are not using backup tapes,” he exclaims, “because we hear [from] customers that [still] have backups using backup tapes. This is no no no, you can’t use backup tapes!” He emphasizes that this technology is outdated and no longer considered secure.

While acknowledging the current trend of migrating data to the cloud, Yigal cautions against putting all eggs in one basket. He reminds the audience that if their internet connection goes down, they will lose access to their cloud-based data. He urges them to think critically about potential vulnerabilities in their network infrastructure.

“If my internet is down, or my firewall is down, or my router is down, or my switch got a malfunction, nothing is working there,” Yigal explains. “You need to replace it. Do you have spare parts in the house?” He leaves the audience with a question to ponder, prompting them to consider their preparedness for such scenarios.

Functioning Internet to Run a Business

Yigal continues his line of questioning, pushing the viewers to consider even the most extreme scenarios. “Even let’s say we take the assumption that everything is on the cloud,” he proposes. “Everything is on the cloud…email, applications, whatever.”

But then, he throws a wrench in the works. “So let’s say you lose connectivity somehow…it happens,” Yigal warns. “Some hardware failure or internet connectivity…the ISP is down…whoever it is, it doesn’t matter the name, really, and it’s down.”

He paints a vivid picture of the potential chaos. “What would you do?” he asks. “How long is it going to take you to get a replacement connection to the internet so your office can function?” Yigal emphasizes that whether you’re on the cloud or not, you still need a functioning internet connection to run your business.

He then brings up another critical point: “If you need to restore from the cloud, and this is the only restore that you have, what kind of connection do you have to the internet?” He questions whether their bandwidth would be sufficient for a speedy recovery, pointing out that it could take days or even weeks to download all the necessary data.
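
A rough back-of-the-envelope estimate shows why. The sketch below assumes decimal units (1 GB = 8,000 megabits) and a hypothetical `efficiency` factor to account for protocol overhead and a shared link. The data sizes and link speeds are illustrative and do not come from the episode.

```python
def estimate_transfer_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimate the hours needed to download `data_gb` gigabytes over a
    link rated at `link_mbps` megabits per second.

    `efficiency` is an assumed fraction of the nominal rate that is
    actually achieved in practice (0 < efficiency <= 1).
    """
    if data_gb < 0:
        raise ValueError("data_gb must be non-negative")
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    megabits = data_gb * 8 * 1000          # decimal gigabytes -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical scenarios: (data to restore in GB, link speed in Mbps).
for data_gb, link_mbps in [(500, 100), (4000, 100), (4000, 1000)]:
    hours = estimate_transfer_hours(data_gb, link_mbps)
    print(f"{data_gb} GB over {link_mbps} Mbps: {hours:.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions, pulling 4 TB over a 100 Mbps link at 70% efficiency takes roughly 127 hours, more than five days, and that is before any time spent rebuilding hosts or verifying the restored machines.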

Real-world Disaster Scenarios

Yigal transitions to discussing disaster scenarios to stimulate the viewers’ thinking. He focuses on a few basic scenarios.

1. Case: Host Hardware Failure

Yigal describes a common situation: a failed hard drive causing a computer to crash. 

He suggests having a spare hard drive on hand for quick replacement and emphasizes the importance of having a reliable source to restore data and images. “Do I have somewhere to restore to this new hard drive?” he asks. Even if no data is lost, rebuilding the machine with the operating system, applications, and updates can still be time-consuming.

Yigal then brings up the client’s case, where the issue wasn’t with the host but with the storage device. This poses the challenge of recovering the virtual machines stored on the failed device. He asks, “How can we spin those virtual machines? Can we restore those virtual machines? How long is it going to take us?”

Yigal introduces another critical question: In the event of a host failure in a virtualized environment, is there another capable host available? He stresses the importance of having a host with sufficient resources—virtual CPUs, storage, and memory—to accommodate all the virtual machines, particularly those running resource-intensive applications like SQL databases.

“Do we have that capability in-house?” Yigal asks. If not, he urges the viewers to carefully consider how they would address this issue.
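
The capacity question can be answered ahead of time with a simple inventory check. The following is a minimal sketch: the `Resources` type, the function names, and all figures are hypothetical, and it assumes no CPU or memory overcommitment, which real hypervisors may allow.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    vcpus: int
    memory_gb: int
    storage_gb: int


def shortfall(host, vms):
    """Return how much of each resource the host is missing in order to
    run all `vms` at once (0 means the host has enough)."""
    needed_vcpus = sum(vm.vcpus for vm in vms)
    needed_memory = sum(vm.memory_gb for vm in vms)
    needed_storage = sum(vm.storage_gb for vm in vms)
    return {
        "vcpus": max(0, needed_vcpus - host.vcpus),
        "memory_gb": max(0, needed_memory - host.memory_gb),
        "storage_gb": max(0, needed_storage - host.storage_gb),
    }


def can_host_all(host, vms):
    """True if the host can accommodate every VM without any shortfall."""
    return all(missing == 0 for missing in shortfall(host, vms).values())


# Hypothetical spare host and VM inventory.
spare_host = Resources(vcpus=16, memory_gb=128, storage_gb=4000)
vms = [
    Resources(vcpus=8, memory_gb=64, storage_gb=2000),   # e.g. a SQL server
    Resources(vcpus=4, memory_gb=32, storage_gb=1000),   # e.g. a domain controller
    Resources(vcpus=8, memory_gb=64, storage_gb=2000),   # e.g. an application server
]
print(can_host_all(spare_host, vms))
print(shortfall(spare_host, vms))
```

Running a check like this against the real inventory, before a failure, shows whether the surviving host can carry the full load or only the most critical machines.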

2. Case: Ransomware Attack/Network Breach Affecting All Servers & Workstations

Yigal then moves on to the second scenario: a ransomware attack or network breach affecting all systems. He compares this to the storage failure scenario, highlighting that in both cases, data becomes inaccessible. The question remains: “What do you do now?”

Yigal points out that even local backups can be encrypted by ransomware, including those stored on a NAS device. He recommends network segmentation as an additional layer of protection. By separating networks, the impact of a breach can be limited, which helps prevent ransomware from spreading to all systems.

Returning to the recovery process, Yigal poses the question of whether virtual machines can be spun up from another host. He points out that even if this is possible, restoring the data and getting those virtual machines running can take time. Plus, there might be compatibility issues that need to be addressed.

As Yigal nears the end of his presentation, he encourages the viewers to carefully consider the information he’s shared and make informed decisions to protect their own data and systems.

Next Steps 

Yigal concludes by outlining the next steps for the audience. He encourages them to have conversations with their IT managers or managed service providers (MSPs) to ask critical questions about their backup and recovery capabilities. These questions should focus on recovery timeframes, the ability to test backups and disaster recovery scenarios through simulations, and the functionality of existing backups.

Yigal emphasizes the cycle of drafting a plan, building it, testing it, and using the feedback to improve the system continuously. He stresses that everyone, including himself and his team, should adhere to this principle.

He shares his experience with the client discussed in the case study, where his team had to restore all the data from the cloud backup to a storage device and physically transport it to the client’s location. This process involved setting up a new host, switch, and storage device to get the client’s systems back online. Despite some challenges, they succeeded in restoring the data and getting the business operational within five days.

Yigal contrasts this with another case involving a jewelry store that suffered a ransomware attack. It took them 71 days to restore their systems, and they even had to manually re-enter all their orders.

He says having good partners can help with the recovery process in the event of a disaster. “Use that feedback in order to build a better system,” Yigal advises. “Improve the system, see where you’re lacking, and get partners, good partners, that you know that if something happens, they can take whatever they need in order to go to you and do the restore and bring you back to business.”

For more in-depth discussions, expert insights, and real-world case studies to enhance your knowledge and protect your digital assets, don’t miss the next episode of The Cybersecurity Insider podcast.

Subscribe and watch now on YouTube, Apple Podcasts, and Spotify.

Yigal Behar – Host

#TheCybersecurityInsider

podcast@TheCybersecurityInsider.com
