Fault in CrowdStrike caused airports, businesses and healthcare services to languish in ‘largest outage in history’

Services began to come back online on Friday evening after an IT failure that wreaked havoc worldwide. But full recovery could take weeks, experts have said, after airports, healthcare services and businesses were hit by the “largest outage in history”.

Flights and hospital appointments were cancelled, payroll systems seized up and TV channels went off air after a botched software upgrade hit Microsoft’s Windows operating system.

The upgrade came from the US cybersecurity company CrowdStrike and left workers facing a “blue screen of death” as their computers failed to start. Experts said every affected PC may have to be fixed manually, but as of Friday night some services had started to recover.

As recovery continues, experts say the outage underscored concerns that many organizations are not well prepared to implement contingency plans when a single point of failure, such as an IT system or a piece of software within it, goes down. Such outages will happen again, they warn, until more contingencies are built into networks and organizations introduce better back-ups.

  • @fishpen0
    12 months ago

    It’s not the best tool out there. It’s the laziest one that works. It’s perfectly possible to securely operate without a rootkit hacked into your kernel.

    Modern approaches involve running an eBPF module on rootless immutable images that are scanned on build. My org is PCI, SOC2, and HITRUST, and we didn’t go down because we would never take such a sad, lax approach as handing off responsibility for security to a third party. The trade-off is that your heads of compliance and security need to actually learn things and work hard to push alternatives with auditors and consultants, and most companies put an MBA who can’t critically think their way out of an empty room at the helm.