• @lp0101
    54 · 1 year ago

    People fail to understand that large projects have inertia. He could have shuttered all Twitter offices, fired all employees, and only paid the server bills, and the website would probably have continued to function just fine for a few months.

    But as a DevOps/SRE person, I've found this whole saga awesome to watch.

    • @geekworking
      12 · 1 year ago

      Weren’t they not paying their AWS bills for a while?

      • @jyter
        11 · 1 year ago

        Back in March it was reported they weren’t paying their AWS bill. Two weeks ago it was reported they weren’t paying their GCP bill either.

    • zalack
      3 · 1 year ago

      And often the tipping point is invisible. Some small routine or service degrades, but outwardly everything still works fine… there is just more strain on the services and clients that use that service, causing them to slowly degrade over the next few hours, days, or weeks, which in turn puts more strain on the services that call those services… etc etc.

      Until one day the system is so degraded that major things start breaking. It seems like it came out of nowhere, but the initial failure happened weeks ago and has been cascading since then.

      Once a system hits that point it’s often not enough to just fix the initial problem because so much of the ecosystem around it has been thrown out of whack.
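
The cascade described above can be sketched with a toy simulation. This is only an illustration under made-up assumptions: the service names (`cache`, `api`, `web`), the health scale from 0 to 1, the `strain_rate` value, and the rule that health never recovers on its own are all hypothetical and not taken from any real system.

```python
def simulate_cascade(deps, initial_fault, strain_rate=0.2, steps=30, fix_at=None):
    """Toy model: every step, each service loses health in proportion to how
    degraded its dependencies are. Health never recovers by itself
    (a simplifying assumption)."""
    health = {name: 1.0 for name in deps}
    health[initial_fault] = 0.9  # the small, outwardly invisible failure
    history = []
    for step in range(steps):
        if fix_at is not None and step == fix_at:
            health[initial_fault] = 1.0  # repair only the original fault
        nxt = {}
        for name, needs in deps.items():
            # Strain is the total degradation of everything this service calls.
            strain = sum(1.0 - health[d] for d in needs)
            nxt[name] = max(0.0, health[name] - strain_rate * strain)
        health = nxt
        history.append(dict(health))
    return history


def first_step_below(history, name, threshold):
    """Return the 1-based step at which `name` first drops below `threshold`."""
    for i, snapshot in enumerate(history):
        if snapshot[name] < threshold:
            return i + 1
    return None


# Hypothetical dependency chain: web calls api, api calls cache.
deps = {"cache": [], "api": ["cache"], "web": ["api"]}

unfixed = simulate_cascade(deps, "cache")
fixed = simulate_cascade(deps, "cache", fix_at=10)

print("web health after 5 steps:", round(unfixed[4]["web"], 3))
print("web first below 0.5 at step:", first_step_below(unfixed, "web", 0.5))
print("web health at end, cache fixed at step 10:", round(fixed[-1]["web"], 3))
```

In this sketch the front-end barely moves for the first few steps and then falls off quickly, and repairing only `cache` partway through does not stop `web` from degrading, because `api` is already unhealthy by then.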

      • Spaceman2901
        1 · 1 year ago

        See the film Passengers for an example of cascade failures from systems trying to cover for each other.

        • zalack
          1 · 1 year ago

          The Expanse has a whole b-plot about an artificial ecosystem going through cascade failure in one of its arcs.

    • lowdownfool
      1 · 1 year ago

      As a way-too-seasoned web developer who appreciates working alongside great SREs, I've found this pretty interesting. I'm honestly surprised more hasn't gone wrong, but maybe that's yet to come. Since they are (I imagine) losing users instead of growing, they might actually avoid the scaling issues that were looming.