Keep Tier-One Applications Out of Virtual Environments

RedFox · edit-2 4 months ago


@Im_old · 4 months ago

That article is SO wrong. You don’t run a single instance of a tier-one application. You run several, in separate DCs, on separate networks, with firewall rules that allow only application traffic. Management (RDP/SSH) comes from another network, through bastion servers. At the very least you have daily/monthly/yearly (yes, yearly) backups, and you take snapshots before patching or app upgrades. Or you even move to containers, with bare hypervisors deployed in minutes via netinstall and configured via Ansible. You got infected? Too bad, reinstall and redeploy. There will be downtime, but nothing horrible. The DBs/storage are another matter of course, but that’s why you have synchronous and asynchronous replicas, read-only replicas, offsites, etc. But for the love of all you hold dear, don’t run stuff on bare metal because “what if the hypervisor gets infected”. Consider the attack vector and work around that.
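To make the daily/monthly/yearly backup idea concrete, here is a minimal sketch of a retention rule, not anyone's actual tooling. The function name `backups_to_keep` and the tier sizes are made up for illustration; real backup products have their own retention settings.

```python
from datetime import date, timedelta

def backups_to_keep(backup_dates, daily=7, monthly=12, yearly=5):
    """Return the backup dates retained under a simple daily/monthly/yearly scheme.

    Hypothetical helper: the tier sizes are illustrative defaults only.
    """
    dates = sorted(set(backup_dates), reverse=True)  # newest first
    keep = set(dates[:daily])  # daily tier: the most recent backups

    # Monthly tier: the newest backup in each of the most recent months.
    newest_per_month = {}
    for d in dates:
        newest_per_month.setdefault((d.year, d.month), d)
    keep.update(list(newest_per_month.values())[:monthly])

    # Yearly tier: the newest backup in each of the most recent years.
    newest_per_year = {}
    for d in dates:
        newest_per_year.setdefault(d.year, d)
    keep.update(list(newest_per_year.values())[:yearly])
    return keep

# One backup per day from 2023-01-01 through 2024-03-10.
start = date(2023, 1, 1)
all_backups = [start + timedelta(days=i) for i in range((date(2024, 3, 10) - start).days + 1)]
kept = backups_to_keep(all_backups, daily=3, monthly=2, yearly=2)
```

Anything not in `kept` would be a candidate for pruning. The point is that the yearly tier survives even when the daily and monthly tiers have long since rolled over.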

@thirteene · 4 months ago

You can prevent downtime by mirroring your container repository and keeping a cold stack in a different cloud service. We wrote an LOE (level of effort) and decided the extra maintenance wasn’t worth it just to plan for provider failures. But then providers only sign contracts if you are in their cloud, and you end up doing it anyway.

Unfortunately most victims aren’t using best practices let alone industry standards. The author definitely learned the wrong lesson though.
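A mirrored container repository only helps if the mirror is actually current. As a rough sketch, assuming you can already export a `{image_tag: digest}` map from each registry (how you fetch those maps depends on your registry and is not shown), a drift check could look like this. `mirror_drift` and the sample tags and digests are hypothetical.

```python
def mirror_drift(primary, mirror):
    """Compare {image_tag: digest} maps from two registries.

    Returns tags that are missing from the mirror and tags whose digest
    differs from the primary. Hypothetical helper for illustration.
    """
    missing = sorted(t for t in primary if t not in mirror)
    stale = sorted(t for t in primary if t in mirror and mirror[t] != primary[t])
    return {"missing": missing, "stale": stale}

# Made-up tags and digests, purely for illustration.
primary = {"erp:1.4": "sha256:aaa", "erp:1.5": "sha256:bbb", "relay:2.0": "sha256:ccc"}
mirror = {"erp:1.4": "sha256:aaa", "erp:1.5": "sha256:old"}
report = mirror_drift(primary, mirror)
```

Running something like this on a schedule tells you whether the cold stack could really be brought up from the mirror today.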

RedFox · 4 months ago

Good comments.

Do you think there’s still a lot of traditional or legacy thinking in IT departments?

Containers aren’t new, neither is the idea of infrastructure as code, but the ability to redeploy a major application stack or even significant chunks of the enterprise with automation and the restoration of data is newer.

@Im_old · 4 months ago

There is so much old and creaky stuff lying around, and people have no idea what it does. Think beige boxes in a cabinet where, when we had to decommission them, the only way to find out what they did was the scream test: turn it off and see who screams!

Or even stuff that was deployed as IaC by an engineer who then left, so from then on it was managed via “clickOps” and the documentation was never updated.

When people talk about tier-one systems they often forget the peripheral stuff required to make them work. Sure, the super mega shiny ERP system is clustered, with FT and DR, off-site backups, etc. But it talks to the rest of the world through an internal SMTP server running on a Linux box under the stairs, connected to a single consumer-grade switch (I’ve seen this. Dust bunnies were almost sentient lol).
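One way to catch that kind of forgotten dependency is to write the dependencies down and walk them. This is only a sketch under the assumption that you have an inventory of instance counts and a dependency map, which is exactly what is usually missing. The component names and `single_points_of_failure` are invented for illustration.

```python
def single_points_of_failure(instances, depends_on, root):
    """Walk everything `root` depends on and return components with < 2 instances.

    `instances` maps component -> number of running instances (unknown
    components are assumed to have one); `depends_on` maps component ->
    list of components it needs. Hypothetical helper for illustration.
    """
    seen, stack, spof = set(), [root], []
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited; also guards against dependency cycles
        seen.add(name)
        if instances.get(name, 1) < 2:
            spof.append(name)
        stack.extend(depends_on.get(name, []))
    return sorted(spof)

# Invented inventory matching the ERP example above.
instances = {"erp": 3, "db": 2, "smtp-relay": 1, "closet-switch": 1}
depends_on = {"erp": ["db", "smtp-relay"], "smtp-relay": ["closet-switch"]}
weak_links = single_points_of_failure(instances, depends_on, "erp")
```

The clustered ERP passes, but the relay and the switch it hangs off both show up as weak links.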

Everyone wants the new shiny stuff but nobody wants to take care of the old stuff.

Or they say “oh, we need a new VM quickly, we’ll install it the old way and then migrate to a container in the cloud”. And guess what, it never happens.