Curious to know how many people do zero-downtime deployment of backend code and how many people regularly take their service down, even if very briefly, to roll out new code.

Zero-downtime deployment is valuable in some applications and a complete waste of effort in others, of course, but that doesn’t mean people do it when they should and skip it when it’s not useful.

  • @none
    11 years ago

    Zero downtime here. We use ECS with services sitting behind ALBs. At deploy time we spin up a new task set, wait for the new tasks to become healthy, and then direct a small amount of traffic to the new set for evaluation. If no alarms go off due to degraded metrics, we increase the share of traffic until the old version receives none. We keep the old version running for a while afterwards to allow instant rollbacks, then shut it down.
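
    A minimal sketch of the rollout loop described above, in Python. It is not the actual tooling: `shift_traffic`, `set_new_weight`, `alarms_firing`, and the step percentages are all hypothetical names and values, and the load balancer and alarms are replaced by in-memory stand-ins so the control flow can be run on its own.

```python
from typing import Callable, Sequence


def shift_traffic(
    set_new_weight: Callable[[int], None],
    alarms_firing: Callable[[], bool],
    steps: Sequence[int] = (5, 25, 50, 100),  # assumed percentages, not from the post
) -> bool:
    """Move traffic to the new version one step at a time.

    Returns True if the new version ends up with 100% of traffic,
    False if an alarm fired and all traffic was sent back to the old version.
    """
    for percent in steps:
        set_new_weight(percent)
        # A real deploy would wait here for metrics to accumulate before checking.
        if alarms_firing():
            # Instant rollback: the old task set is still running.
            set_new_weight(0)
            return False
    return True


# Healthy rollout: no alarm ever fires.
history: list[int] = []
ok = shift_traffic(history.append, lambda: False)

# Unhealthy rollout: a pretend alarm fires once the new version has 25% or more.
bad_history: list[int] = []
rolled_back_ok = shift_traffic(bad_history.append, lambda: bad_history[-1] >= 25)
```

    In a real setup `set_new_weight` would adjust whatever the platform uses to split traffic between the two task sets, and `alarms_firing` would query the monitoring system. Shutting down the old version after the rollback window is left out of the sketch.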

    Web services make this easy because it’s trivial to direct traffic to one version or the other. What’s more interesting to me is how to accomplish the same thing for other kinds of workloads. For example, with workers consuming a queue, I haven’t found a way to gradually increase the amount of work available to the new version without implementing custom application logic. (I work on a platform with thousands of services, so I’m looking for a way to do it at the infrastructure level rather than having each service implement something.)
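
    For illustration only, here is roughly what the "custom application logic" option could look like for queue workers. This is an assumption about one possible approach, not something described in the comment: `bucket`, `should_handle`, and `new_percent` are hypothetical names. Each message ID is hashed into a stable bucket from 0 to 99, the new version takes buckets below `new_percent`, and the old version takes the rest.

```python
import hashlib


def bucket(message_id: str) -> int:
    """Map a message ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_handle(message_id: str, is_new_version: bool, new_percent: int) -> bool:
    """Decide whether this worker version owns the message.

    new_percent is the share of buckets (0..100) assigned to the new version.
    """
    in_new_share = bucket(message_id) < new_percent
    return in_new_share if is_new_version else not in_new_share


# Every message is owned by exactly one of the two versions.
ids = [f"msg-{i}" for i in range(1000)]
exclusive = all(
    should_handle(m, True, 10) != should_handle(m, False, 10) for m in ids
)
```

    A worker that receives a message it does not own would still have to hand it back to the queue somehow, and every service would need to carry this check, which is the per-service cost the comment is trying to avoid.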