A global tech outage has shut down many of the world’s services. Airlines were grounded, news channels were unable to go live, banks went offline, and even 911 operators were affected. Many operators woke up to find that their computers wouldn’t even start. This massive outage was caused by a single software update from the cybersecurity firm CrowdStrike. The flawed update caused many devices running Windows to show the “Blue Screen of Death”: instead of booting up as usual, the computers crashed.
CrowdStrike is one of the world’s biggest cybersecurity companies. It develops software that helps companies detect and prevent hacks. Its technology is widely used by companies worldwide to manage the security of devices running Windows.
The outage stemmed from Falcon Sensor, a popular piece of software that CrowdStrike provides. Falcon is antivirus software that secures “endpoints” like laptops, servers, mobile devices, and point-of-sale systems. To secure these endpoints, Falcon needs deep-level access to the device’s operating system. This allows the software to monitor for malicious software or suspicious activity. Deep-level access like this is known as kernel access, which refers to the core level of a computer’s operations, where interactions between hardware and software take place. Many cybersecurity products need this highly privileged access so that they can monitor any part of a computer that hackers may attack. According to IT analysts, the defective update affected the driver that has kernel access and the way it interacts with the Windows operating system, ultimately leaving computers stuck in an endless reboot loop.
The outage affected several major industries worldwide, including healthcare and aviation. Canadian healthcare systems went offline, which left some workers using pen and paper for some tasks. Surgeries were also canceled at many hospitals in several countries. International and domestic flights took a huge blow, with almost all flights grounded for varying lengths of time. Airlines reported that the outage affected many systems, such as those that check in passengers, calculate aircraft weight, and communicate with crews in the air.
The update was routine for CrowdStrike, so the outage came as a surprise to everybody. There aren’t many ways for a company to predict a faulty update like this one, but there is a method to minimize the damage. Instead of updating the software on all endpoints at once, the company can roll the update out in stages. For example, if the software runs on 8 million devices, the company first updates 1 million of them rather than all 8 million. If the update turns out to be faulty, as the Falcon Sensor update was, only those 1 million devices are affected, significantly fewer than the 8 million running the software. Once the company finds that the update is faulty, it can work out why and potentially fix it before finishing the rollout to the remaining endpoints.
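The staged approach can be sketched in a few lines of Python. This is only an illustration of the idea, not how CrowdStrike or any other vendor actually ships updates: the function name staged_rollout, the batch_is_healthy check, and the figures of 8 million devices and 1 million per batch are assumptions taken from the example above.

```python
from typing import Callable, Dict


def staged_rollout(
    total_devices: int,
    batch_size: int,
    batch_is_healthy: Callable[[int], bool],
) -> Dict[str, object]:
    """Update devices one batch at a time and stop at the first bad batch.

    batch_is_healthy is a hypothetical check (for example, "did the devices
    in this batch boot normally after the update?") that receives the
    zero-based batch number and returns True if the batch looks fine.
    """
    updated = 0
    batch_number = 0
    while updated < total_devices:
        # The last batch may be smaller than batch_size.
        current_batch = min(batch_size, total_devices - updated)
        updated += current_batch
        if not batch_is_healthy(batch_number):
            # Halt the rollout: only the devices updated so far are affected.
            return {
                "halted": True,
                "devices_updated": updated,
                "devices_spared": total_devices - updated,
            }
        batch_number += 1
    return {"halted": False, "devices_updated": updated, "devices_spared": 0}


# A faulty update: the very first batch fails its health check.
result = staged_rollout(8_000_000, 1_000_000, lambda batch: False)
affected_share = result["devices_updated"] / 8_000_000
print(result, f"{affected_share:.1%} of devices affected")
```

With these assumed numbers, the rollout halts after the first batch, so 1 million of the 8 million devices (12.5%) receive the faulty update and the other 7 million are spared. The trade-off is speed: a healthy update takes longer to reach every device because each batch has to be checked before the next one starts.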
Companies continue to recover after the outage, and many will take some time to return to normal. CrowdStrike is also working hard to develop patches. However, these will not resolve the outage immediately.
Sources:
https://www.cbc.ca/news/world/cyberstrike-worldwide-outage-1.7268863