All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It’s all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We’ll see if that changes over the weekend…

  • @solomon42069
    link
    English
    7
    2 months ago

    OK, but people aren’t running CrowdStrike OS. They’re running Microsoft Windows.

    I think that some responsibility should lie with Microsoft - to create an OS that

    1. Recovers gracefully from third party code that bugs out
    2. Doesn’t allow third party software updates to break boot

    I get that there can be unforeseeable bugs, I’m a programmer of over two decades myself. But there are also steps you can take to strengthen your code, and as a Windows user it feels more like their resources are focused on random new shit no one wants instead of on the core stability and reliability of the system.

    It seems to me like third party updates have a lot of control/influence over the OS and that’s all well and good, but the equivalent of a “Try and Catch” is what they needed here and yet nothing seems to be in place. The OS just boot loops.
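
    To make the “Try and Catch” idea concrete, here is a minimal sketch of a boot sequence that counts failures per third-party module and skips a module after repeated crashes instead of looping forever. Every name in it (`boot`, `BootState`, `MAX_FAILURES`, the module names) is hypothetical, and it is a toy model only: a fault in real kernel-mode code takes the whole system down and cannot simply be caught like a Python exception, so this shows the policy, not how Windows or CrowdStrike actually work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_FAILURES = 2  # hypothetical threshold before a module is skipped


@dataclass
class BootState:
    # Consecutive failed boots attributed to each third-party module.
    # In a real system this would have to persist across reboots.
    failures: Dict[str, int] = field(default_factory=dict)


def boot(modules: Dict[str, Callable[[], None]], state: BootState) -> List[str]:
    """Try to start every module and return the names that loaded."""
    loaded = []
    for name, start in modules.items():
        if state.failures.get(name, 0) >= MAX_FAILURES:
            # Quarantined: boot in a degraded mode without this module.
            continue
        try:
            start()
        except Exception:
            # Record the failure and carry on with the remaining modules.
            state.failures[name] = state.failures.get(name, 0) + 1
            continue
        state.failures[name] = 0
        loaded.append(name)
    return loaded


def bad_update() -> None:
    # Stand-in for a third-party update that crashes on load.
    raise RuntimeError("malformed content file")


def good_driver() -> None:
    # Stand-in for a module that starts cleanly.
    pass


state = BootState()
modules = {"storage": good_driver, "third_party_sensor": bad_update}
# Simulate three consecutive boots with the same broken update installed.
results = [boot(modules, state) for _ in range(3)]
```

    In this toy model the machine comes up on every attempt with only the healthy module loaded, and from the third boot on the broken one is not even attempted. Whether a security product should ever be skipped automatically is a separate design question that the sketch does not try to answer.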

    • @barsquid
      link
      English
      14
      2 months ago

      AFAICT Microsoft is busy placing ads on everything and screen logging user activity instead of making a resilient foundation.

      For contrast: I’ve been running Fedora Atomic. I’m sure it is possible to add some kernel mod that completely breaks the system. But if there was a crash on boot, in most situations, I’d be able to roll back to the last working version of everything.

    • @EnderMB
      link
      English
      7
      edit-2
      2 months ago

      It’s not just Windows, it’s affecting services that people who primarily use other OSes rely on, like Outlook or federated login.

      In these situations, blame isn’t a thing, because everyone knows that an LSE can happen to anyone at any time. The second you start to throw stones, people will throw them back when something inevitably goes wrong.

      While I do fundamentally agree with you, and believe that the correct outcome should be “how do we improve things so that this never happens again”, it’s hard to attach blame to Microsoft when they’re the ones that have to triage and make sure the communication gets handled.

      • @solomon42069
        link
        English
        4
        edit-2
        2 months ago

        I reckon it’s hard to attach blame to Microsoft because of the culture of corporate governance and how decisions are made (without experts).

        Tech has become a bunch of walled gardens with absolute secrecy over minor nothings. After 1-2 decades of that, we have a generation of professionals who have no idea how anything works and need to sign up for $5 a month phone app / cloud services just to do basic stuff they could normally do on their own on a PC - they just don’t know how or how to put the pieces together due to inexperience / lack of exposure.

        Whether it’s corporate or government leadership, the lack of understanding of basics in tech is now a liability. It’s allowed corporations like Microsoft to set their own quality standards without any outside regulation while they are entrusted with vital infrastructure and to provide technical advisory, even though they have a clear vested interest there.

    • @lanolinoil
      link
      English
      2
      2 months ago

      banks wouldn’t use something that black box. just trust me bro wouldn’t be a good pitch