A security update gone wrong caused a major IT outage, impacting millions of Microsoft devices worldwide. As organisations continue to restore their services, how can we better prepare for future disruptions?

Lessons learned from the CrowdStrike global IT outage

In the early hours of Friday 19 July 2024, organisations worldwide awoke to a high-profile IT outage that impacted Microsoft services.

Affected Windows devices were reported to be showing the infamous blue screen of death (BSOD).

Others reported a recovery screen, which still guided users to restart their PC.

Microsoft was quickly made aware of the issue, which was soon traced to a cyber security company called CrowdStrike. While the initial problem was investigated and resolved fairly quickly, the outage continued to affect organisations for days, making it one of the worst cyber events since the WannaCry cyber-attack in May 2017.

 

Who was affected by the global IT outage?

It was reported that an estimated 8.5 million Windows devices were affected by the global IT outage, impacting organisations around the world. Airports were thrown into chaos as more than 1,000 flights were cancelled, with further delays and huge queues building in terminals.

The NHS and GPs also struggled to access their records systems, online bookings and repeat prescriptions.

Additional disruptions were reported across banks, broadcasters, transport and retailers.

It’s worth noting that macOS and Linux devices were not affected.

 

What caused the IT outage?

The global IT outage was quickly traced to CrowdStrike, one of the world’s best-known cyber security companies, which specialises in Endpoint Detection and Response (EDR).

It appears that CrowdStrike released a configuration update for Windows hosts in the late hours of Thursday evening. The update was designed to protect Microsoft Windows devices from malicious attacks.

Unfortunately, investigation found a defect in the update that caused affected Windows devices to crash and display the blue screen of death (BSOD), then restart without warning. Since the startup process couldn’t be completed, the reboot cycle continued. This is often known as a boot loop.

After the initial reports of the incident, CrowdStrike and Microsoft worked to investigate the issue and provided remediation steps for affected devices.

While this incident could have been prevented by CrowdStrike, it could happen to essentially any solution provider.

Businesses cannot be completely immune to downtime. However, it is crucial to have a well-established business continuity plan in place for incidents like the CrowdStrike IT outage. This will help your business get back up and running as soon as possible without the loss of data.

 

Preparing for the wave of hacking attempts

As the recent CrowdStrike global IT outage continues to impact organisations, cybercriminals are seizing the opportunity to exploit fear and uncertainty. It’s important to remain vigilant during these challenging times and continue to be wary of phishing attempts in the form of emails, calls and text messages.

Be cautious of downloading workarounds from unofficial websites, and follow the guidance of CrowdStrike or IT representatives you trust.

 

Build IT resilience through business continuity

The recent CrowdStrike and Microsoft global IT outage is neither the first nor the last major IT incident to occur.

In today’s technology-dependent world, it’s essential for businesses to have measures in place for risk management and business continuity in case of any major incidents, both technical and non-technical.

Here are some steps that your business can take to reduce the risk of a complete shutdown of services:

  • Risk Management and Business Continuity – Revisit your current risk management strategies. Identify vulnerabilities, assess potential impact, and create contingency plans for major disruptions.
  • Sufficient Back-up Strategy – Ensure that you have multiple back-up copies and locations ready to deploy if needed. A more recent recommendation is to follow the 4-3-2 back-up rule (4 copies in 3 locations, two of which should be off-site).
  • Proactive and Regular Testing – Test your business continuity plan regularly. Roll out software updates in stages and closely monitor their impact before deploying them more widely.
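
As a rough illustration of the 4-3-2 back-up rule mentioned above, the short Python sketch below checks whether a list of back-up copies meets it. The class, function, copy names and locations are all hypothetical and purely for illustration. The check reads the rule as at least 4 copies spread across at least 3 locations, with at least two of those locations off-site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of your data (hypothetical model, for illustration only)."""
    name: str        # label for the copy, e.g. "cloud"
    location: str    # site or region where the copy is stored
    off_site: bool   # True if stored away from the main premises


def meets_4_3_2(copies):
    """Return True if the copies satisfy this reading of the 4-3-2 rule."""
    locations = {c.location for c in copies}
    off_site_locations = {c.location for c in copies if c.off_site}
    return (
        len(copies) >= 4                   # 4 copies...
        and len(locations) >= 3            # ...in 3 locations...
        and len(off_site_locations) >= 2   # ...two of which are off-site
    )


# Example inventory (all names are made up)
inventory = [
    BackupCopy("production", "head-office", off_site=False),
    BackupCopy("local-nas", "head-office", off_site=False),
    BackupCopy("dr-replica", "secondary-datacentre", off_site=True),
    BackupCopy("cloud", "cloud-region", off_site=True),
]

print(meets_4_3_2(inventory))
```

A check like this only confirms that the copies exist on paper. It does not replace regular restore testing as part of your business continuity plan.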

Lee Johnson, CISO/CIO of Air IT and MD of Air Sec, the Cyber Security division of Air IT, stated:

In the wake of the CrowdStrike and Microsoft global IT outage, we witnessed first-hand how seemingly routine software updates can cascade into chaos. These disruptive moments serve as stark reminders: business continuity planning is not a luxury; it’s a strategic necessity. Businesses need to prioritise resilience to adapt and thrive in any situation.

Air IT has a large and experienced team of CIOs, CTOs, and CISOs, who provide support and IT consultancy for your business continuity strategy. You can expect automated backups at regular intervals to minimise any risk of data loss. This will save you time, and you will benefit from the peace of mind of knowing that your systems are protected.

 

Strengthen your overall IT resilience

In today’s complex digital landscape, businesses need to strengthen their resilience strategies to withstand not only unforeseen disruptions but also the relentless evolution of cyber threats.

Businesses must be prepared to adapt to challenges in order to thrive, including developing a proactive mindset to prepare for the worst and investing in technologies for an agile IT environment.

Discover strategies to help improve your IT and cyber security in our blog, ‘Strengthening Your IT Resilience in 2024 & Beyond’.

Alternatively, feel free to contact us for further assistance in enhancing your security posture through cyber resilience.
