How Kelsey-Seybold Clinic recovered from a ransomware attack
This past year, ransomware attacks cost healthcare organizations more than $20 billion, according to a study from Comparitech. Even more critical, however, is the risk to patient care and continuity.
Data backup remains essential. Today, however, rapid restore is equally important for defending against and recovering from ransomware and other malicious attacks.
Martin Littmann, senior director, chief technology officer and chief information security officer at Houston-based Kelsey-Seybold Clinic, has more than 30 years of experience in healthcare and IT. He knows firsthand what's needed to defend against and recover from an attack.
After it experienced a ransomware incident, the clinic shifted its security strategy, creating an environment of immutable data snapshots and backups. Healthcare IT News interviewed Littmann, who shared his expertise on the matter.
Q. Please talk about the ransomware incident you experienced. When did it happen? How did the hackers take control? What was affected? And how did you resolve it?
A. It may surprise folks to know that ransomware has been around a long time – with the first documented incident occurring in 1989. Based on one article I read, there were quite a few variants in existence by 2015, which was the same year we experienced our incident.
Two employees working in the same department visited a day care site to look at its services during lunchtime. That site was built on WordPress and was not kept current. The malware they received as a drive-by download was a zero-day variant of CryptoLocker. Our FireEye appliance notified us of the malware at the same time users were calling to report they were unable to access certain files on a network share.
We were able to quickly identify the two infected machines. One was a physical PC, and one was a virtual desktop. The virtual desktop was rebooted to a clean image and the physical machine was taken off the network and re-imaged. The systems and storage team was able to quickly identify the extent of the impact: hundreds of thousands of encrypted files across two department shares.
After several discussions with the executives over the area impacted, we decided to work through the day and perform remediation that evening. We first restored affected files from snap backups so users could continue their business processes. At the end of the day, we restored the entire file shares followed by backups of the files revised throughout the business day.
Q. What are a couple of lessons you learned from that ransomware incident?
A. The event highlighted the need for the information security team to be vigilant in reviewing and responding to alerts from our security solutions. It also illuminated the value of the information security, network and systems teams working in harmony, and underscored the reality that security is everyone's business.
In subsequent years, this event was used to highlight the need for richer and more frequent user education, as well as the need to bolster and continually improve our security and systems tools.
Too often, security teams are flooded with alerts from various tools and systems. Without an effective tool or process or increased "eyes on glass," there is a risk of missing critical alerts. We ultimately upgraded our SIEM approach by employing a data lake-based tool with significantly improved AI and analytic capabilities.
We also were able to increase the number of infrastructure, network and security systems feeding the SIEM. With this tool, we were able to fine-tune alerts to ensure critical alerts were not missed and to lower false positives. We then further improved our process by adding a SOAR (security orchestration, automation and response) tool so the security team could triage and respond to alerts and so we had a record of that accountability.
Additionally, we invested time evaluating open shares and over-provisioned user access to limit exposure in any actual malware event.
We have stepped up our user education program, which includes periodic phishing testing. We also leverage current security and privacy news items about breaches and healthcare fines to remind executives and leaders of the need for the organization to be educated and vigilant.
Each month, we also send out an information security tips newsletter discussing current types of attacks, remediations and precautions we can take as a business and as individuals.
Q. You implemented immutable snapshots and backups across Kelsey-Seybold. What is this technology, and how does it work to protect your systems and data?
A. We have a blended mix of storage technologies and a solid data protection solution we have leveraged and upgraded since 2007. Our backup strategy was developed based on this solution set, before most storage vendors had delivered or matured immutable backup approaches and before immutable snapshots were available in any storage products we used.
We developed a layered approach that relies on primary, backup and archive backups on separate fault domains, as well as local snapshot copies. Each of these relies on separate administrator-level accounts on separate storage systems. This strategy creates five copies of our critical EHR database across three different vendor platforms.
The production ODB (open database connectivity) volumes are housed on a purpose-built Pure Storage FlashArray //x50. This is the initial live dataset. The primary volumes are mapped to active/passive frames that are kept in configuration lockstep via Pure Storage HostGroups. This ensures consistent volume mappings during a compute host frame failover.
The database is snapshotted on the array four times a day using a freeze/snap/thaw script workflow that in-house teams developed in collaboration with Pure Storage engineers. A second copy is replicated to a secondary FlashArray.
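For readers unfamiliar with the freeze/snap/thaw pattern, the ordering can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kelsey-Seybold's or Pure Storage's actual script: the function `snapshot_with_freeze` and the three stand-in callables are hypothetical, and a real workflow would call the database's own freeze and thaw commands and the array's snapshot interface instead.

```python
from typing import Callable, List


def snapshot_with_freeze(
    freeze: Callable[[], None],
    snap: Callable[[], str],
    thaw: Callable[[], None],
) -> str:
    """Pause database writes, take an array snapshot, then always resume."""
    freeze()
    try:
        return snap()
    finally:
        # Thaw even if the snapshot fails, so the database never stays frozen.
        thaw()


# Hypothetical stand-ins that only record the order of the steps.
calls: List[str] = []


def fake_freeze() -> None:
    calls.append("freeze")


def fake_snap() -> str:
    calls.append("snap")
    return "snap-0001"  # made-up snapshot name


def fake_thaw() -> None:
    calls.append("thaw")


snapshot_name = snapshot_with_freeze(fake_freeze, fake_snap, fake_thaw)
print(snapshot_name, calls)
```

The point of the `try`/`finally` is that the thaw step runs whether or not the snapshot succeeds, which is the main safety property such a workflow would likely need.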
The arrays are linked via FlashArray Async-Replication over redundant 10G links with replication schedules controlled through protection groups. Copies three and four are sent to another backup proxy and to tape. The fifth copy of the database is application-mirrored to a disaster recovery instance. The DR instance lives on separate compute hardware and a separate tertiary FlashArray.
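The layered design described above, five copies each held on a separate system under a separate administrator account, lends itself to a simple automated check. The Python sketch below is illustrative only: the `BackupCopy` type, the platform labels and the account names are all hypothetical placeholders, not the clinic's real inventory.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class BackupCopy:
    name: str
    platform: str       # storage system / fault domain holding the copy
    admin_account: str  # administrator account that controls that system


def shared_admin_accounts(copies: List[BackupCopy]) -> List[str]:
    """Return admin accounts that control copies on more than one platform."""
    platforms_by_account: Dict[str, Set[str]] = {}
    for copy in copies:
        platforms_by_account.setdefault(copy.admin_account, set()).add(copy.platform)
    return sorted(
        account
        for account, platforms in platforms_by_account.items()
        if len(platforms) > 1
    )


# Hypothetical inventory mirroring the five copies described in the interview.
inventory = [
    BackupCopy("production volumes", "flasharray-primary", "admin-a"),
    BackupCopy("async replica", "flasharray-secondary", "admin-b"),
    BackupCopy("backup proxy copy", "backup-proxy", "admin-c"),
    BackupCopy("tape copy", "tape-library", "admin-d"),
    BackupCopy("DR mirror", "flasharray-tertiary", "admin-e"),
]

print(shared_admin_accounts(inventory))
```

An empty result means no single compromised administrator account could reach copies on more than one system, which is the property the separate-accounts approach is meant to provide.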
In our future implementations and evolution, we will look at adding in native storage data protection solutions, such as the immutable capabilities now in the market.
Q. What are the data center business continuity security considerations you have, and what are you doing about them?
A. For many years, our executive leadership was content with the level of performance, availability and reliability of our systems and infrastructure delivered through our top-tier network and high-availability data center.
But as our business continued to grow and with significant future growth projections, we all came to the realization that a single data center represents a business risk that should be mitigated regardless of how redundant and resilient any implementation may be. This realization led to approval of establishing a second backup/recovery data center.
In today's environment, this decision also required an evaluation of a cloud-based approach to backup and recovery. We studied the technical aspects of cloud-based infrastructure and application solutions on the market and how these fit with our current premise-based solutions and hybrid cloud application solutions.
A significant consideration was conversion from premise-hosted virtual desktops, as well as key clinical applications. In the final analysis, we determined that, given the current technical limitations, the cloud options are not quite ready for us to move to a full cloud implementation.
Rather, we decided to continue with our current hybrid model: premise-based compute and storage for critical applications and department/user file shares, combined with cloud-based email and collaboration in Office 365.
We will continue to develop Power Apps, leverage Azure and migrate some data backup/archive to S3 instances. Our cloud utilization will continue to grow as more applications mature to include cloud components or shift entirely to cloud architectures.
Eventually this shift will include moving to desktop-as-a-service, and that shift will drastically affect the premise footprint of storage and compute.
Twitter: @SiwickiHealthIT
Email the writer: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.