An angry admin shares the CrowdStrike outage experience

lemme in · 7 months ago

db0 · 7 months ago

“Pity the administrators who dutifully kept a list of those keys on a secure server share, only to find that the server is also now showing a screen of baleful blue.”

Lol, can you imagine? It empathetically hurts me even thinking of this situation. Enter that brave hero who kept the fileshare decryption key in a local KeePass :D

@[email protected] · edit-2 7 months ago

That’s why the 3-2-1 rule exists:

3 copies of everything on
2 different forms of media with
1 copy off site

For something like keys, that means:

secure server share
server share backup at a different site
physical copy (a USB stick, a printout in a safe, etc.)

Any IT pro should be aware of this “rule.” Oh, and periodically test restoring from a backup to make sure the backup actually works.
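
For anyone who wants to sanity-check their own inventory, here is a minimal Python sketch of the 3-2-1 rule as listed above. The `Copy` class, the `satisfies_3_2_1` function, and the medium and site labels are all made up for illustration; they don't come from any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of the keys (hypothetical record, for illustration)."""
    name: str
    medium: str  # assumed labels, e.g. "disk", "usb", "paper"
    site: str    # assumed labels, e.g. "hq", "dr-site"


def satisfies_3_2_1(copies, primary_site):
    """Return True if there are >= 3 copies, on >= 2 media, with >= 1 off site."""
    media = {c.medium for c in copies}
    offsite = [c for c in copies if c.site != primary_site]
    return len(copies) >= 3 and len(media) >= 2 and len(offsite) >= 1


# The three copies from the list above, with assumed media and sites.
key_copies = [
    Copy("secure server share", "disk", "hq"),
    Copy("server share backup", "disk", "dr-site"),
    Copy("printed copy in a safe", "paper", "hq"),
]

print(satisfies_3_2_1(key_copies, primary_site="hq"))  # True
```

This only checks the inventory on paper. It says nothing about whether a restore from any of those copies actually works, which still has to be tested by hand.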

@IphtashuFitz · 7 months ago

We have a cron job that once a quarter files a ticket with whoever is on-call that week to test all our documented emergency access procedures to ensure they’re all working, accessible, up to date, etc.
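
A rough sketch of what a job like that could look like, not the actual setup described above. The rotation list, the week-based on-call lookup, the checklist wording, and the ticket fields are all assumptions; a real version would pull the assignee from the paging tool and post the ticket to the tracker's API. A cron schedule such as `0 9 1 1,4,7,10 *` (09:00 on the first day of each quarter) would be one way to trigger it.

```python
from datetime import date

# Hypothetical on-call rotation; in practice this comes from the paging tool.
ROTATION = ["alice", "bob", "carol"]


def on_call_for(day, rotation):
    """Pick the on-call person by ISO week number (assumed weekly rotation)."""
    week = day.isocalendar()[1]
    return rotation[week % len(rotation)]


def build_ticket(day, rotation):
    """Build the ticket payload; actually filing it is left to the tracker's API."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "title": f"Q{quarter} {day.year}: test emergency access procedures",
        "assignee": on_call_for(day, rotation),
        "checklist": [
            "Each documented procedure still works",
            "Credentials and keys are accessible without the primary systems",
            "Documentation is up to date",
        ],
    }


if __name__ == "__main__":
    print(build_ticket(date.today(), ROTATION))
```

Keeping the payload builder separate from the part that files the ticket makes it easy to test without touching the tracker.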

@[email protected] · 7 months ago

Are you hiring!?

@kescusay · 7 months ago

Seems like an argument for a heterogeneous environment, perhaps a solid and secure Linux server to host important keys like that.

@[email protected] · 7 months ago

Linux can shit the bed too. You need to maintain a physical copy.

@[email protected] · 7 months ago

Their point is not that linux can’t fail, it’s that a mix of windows and linux is better than just one. That’s what “heterogeneous environment” means.

You should think of your network environment like an ecosystem; monocultures are vulnerable to systemic failure. Diverse ecosystems are more resilient.

@[email protected] · 7 months ago

Sure, but the chance of your Windows and Linux machines shitting the bed at the same time is lower than if everything is running Windows. It’s exactly the same reason you keep a physical copy (which after all can break/burn down etc.) - more baskets to spread your eggs across.
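
To put rough numbers on the eggs-and-baskets argument, here is a tiny Python sketch. The probabilities are invented for illustration, and the calculation assumes the two platforms fail independently, which stops being true the moment both run the same agent or share some other dependency.

```python
import math


def p_all_fail(probabilities):
    """Probability that every system fails at once, ASSUMING independent failures."""
    return math.prod(probabilities)


# Made-up figures: each platform has a 1% chance of being down in a given window.
p_windows = 0.01
p_linux = 0.01

print(p_all_fail([p_windows]))           # one platform only
print(p_all_fail([p_windows, p_linux]))  # both platforms down together
```

Under those assumptions the combined figure is the product of the individual ones, so it can never be higher than the single-platform figure. How much lower it is in practice depends on how independent the two stacks really are.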

@[email protected] · 7 months ago

Very few businesses are going to spend the money running redundant infrastructure on two different operating systems. Most of them won’t even spend the money on a proper DR plan.

@[email protected] · 7 months ago

Then they get to suffer the consequences when shit like this happens

@[email protected] · 7 months ago

“Then they get to suffer the consequences when shit like this happens”

Oh, they are.

@noobface · 7 months ago

Hey Ralph can you get that post-it from the bottom of your keyboard?

@StaySquared · 7 months ago

CS did take down Linux a few years back… I forget the exact details.

@Avatar_of_Self · 7 months ago

Yes, but has it taken both OSes out at the same time? It hasn’t. It could happen, but the chances are even lower. There’s obvious risk mitigation in mixing vendors in infrastructure for both hardware and software in the enterprise.

If your enterprise lost some critical services last time, until RH updated their kernel, then you could have benefited from running those services on Windows as well. Now the reverse is true. If you wanted to, you could have another DC via Samba on Linux in your forest so that you still have an AD, for example. The same goes for file share servers, intermediate certificate servers (hopefully your Root CA is not always on the network) and pretty much most critical services.

Most enterprises run a lot of services off of a hypervisor and have overhead to scale (or they are already on a sinking ship), so you can just spin up VMs to do that. It isn’t as if it is unreasonably labor-intensive compared to other similar risk mitigation implementations. Any sane CCB (obviously there are edge cases, but we are talking in general here) will even let you get away without a vendor support contract for those, since they are just for emergency redundancy and not anywhere near critical unless the critical services have already shit the bed.

Amanda · 7 months ago

Sounds like we may have an easier conclusion to draw here