A university near me must be going through a hardware refresh, because they’ve recently been auctioning off a bunch of ~5 year old desktops at extremely low prices. The only problem is that you can’t buy just one or two. All the auction lots are batches of 10-30 units.

It got me wondering if I could buy a bunch of machines and set them up as a distributed computing cluster, sort of a poor man’s version of the way modern supercomputers are built. A little research revealed that this is far from a new idea. The first really successful cluster of this kind (called Beowulf) was built by a team at NASA in 1994 from off-the-shelf PCs instead of the expensive custom hardware other supercomputing projects were using at the time. It was also a watershed moment for Linux, then only a few years old, which was used to run Beowulf.

Unfortunately, a cluster like this seems less practical for a homelab than I had hoped. I initially imagined there would be some kind of abstraction layer allowing any application to run across all the computers in the cluster, the same way a program might scale to consume as many threads and cores as a single CPU makes available. After some more research I’ve concluded that this is not the case. The only programs that can really take advantage of distributed computing seem to be ones specifically designed for it, and most of those fall broadly into two categories: expensive enterprise software licensed to large companies, and bespoke programs written by academics for their own research.
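For anyone wondering what “specifically designed for it” looks like, the classic Beowulf-era tool is MPI (message passing). Here’s a minimal sketch, assuming mpi4py and an MPI runtime like Open MPI are installed on every node; the file name and workload are just for illustration:

```python
# sum.py - each process sums its own slice of the range, then rank 0
# combines the partial results. MPI launches one copy of this program
# per process, potentially spread across many machines.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # this process's ID within the whole job
size = comm.Get_size()  # total number of processes across all nodes

local_sum = sum(range(rank, 10_000_000, size))
total = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print(f"sum computed by {size} processes: {total}")
```

You’d launch it with something like `mpirun --hostfile hosts -np 20 python sum.py` and MPI spreads the processes across the machines in the hostfile. The point is that the program has to be written around that model; nothing distributes an ordinary application for you.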

So I’m curious what everyone else thinks about this. Have any of you built or administered a Beowulf cluster? Are there any useful applications that would make it worth building for the average user?

  • @[email protected]
    7
    11 months ago

    It really depends on what sort of workload you want to run. Most programs have no concept of horizontal scaling like that, and those that do usually deal with it by just running an instance on each machine.

    That said, if you want to run lots of different workloads at the same time, you might want to have a look at something like Kubernetes. I’m not sure what you’d want to run in a homelab that would use even 10 machines, but it could be fun to find out.
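    For a taste of what that looks like once the machines are joined into a cluster (kubeadm, k3s, etc.), here’s a rough sketch using the official Python client. It assumes a working ~/.kube/config, and “nginx” is just a placeholder deployment name:

    ```python
    # List every node that has joined the cluster, then scale a
    # deployment out across them; the scheduler picks the machines.
    from kubernetes import client, config

    config.load_kube_config()  # reads ~/.kube/config

    for node in client.CoreV1Api().list_node().items:
        print(node.metadata.name, node.status.node_info.kubelet_version)

    client.AppsV1Api().patch_namespaced_deployment_scale(
        name="nginx", namespace="default",
        body={"spec": {"replicas": 10}},
    )
    ```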

    • @plenipotentprotogodOP
      3
      11 months ago

      I’m not sure what you’d want to run in a homelab that would use even 10 machines, but it could be fun to find out.

      Oh yeah, this is absolutely a solution in search of a problem. It all started with the discovery that these old (but not ancient, most of them are Intel 7th gen) computers were being auctioned off for like $20 apiece. From there I started trying to work backwards toward something I could do with them.

      • Lettuce eat lettuce
        6
        11 months ago

        There are several more practical uses for old PCs like that imo.

        You could grab a few of them, throw in some second-hand GPUs, clean them up, and install Bazzite/ChimeraOS/Holo to turn them into affordable Steam consoles. Sell them or give them away to friends, family, people online, etc.

        You could also refurbish them and donate them to a school or community center that is underfunded, which would be pretty cool.

        Use them as home media PCs, or build a homelab and use them as servers for different tasks: one as a NAS, another as a hypervisor for VMs, another as a pfSense/OPNsense router/firewall, etc.

        Or just goof around and build a janky but badass cluster lol. When they are that cheap, almost anything you use them for is better value than they are as e-waste.

      • @[email protected]
        3
        11 months ago

        They sound usable enough. If you’re interested in it, have you considered running an LLM or similar? I think they cluster. If they’ve got GPUs you could try Stable Diffusion too.
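        If you do scrounge up GPUs, the diffusers library is probably the easiest way to kick the tires. Rough sketch, assuming a CUDA card with a few GB of VRAM (the weights download on first run):

        ```python
        # Generate one image with Stable Diffusion via Hugging Face diffusers.
        import torch
        from diffusers import StableDiffusionPipeline

        pipe = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
        ).to("cuda")

        image = pipe("a beowulf cluster built from beige desktops").images[0]
        image.save("cluster.png")
        ```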

        Mind you, at that price point I think we’re past the point of just thinking of them as compute resources. Use them as blocks, build a fort and refuse to come out unless someone comes up with a better idea.

        • @plenipotentprotogodOP
          2
          11 months ago

          I’ll have to look a little more into the AI stuff. It was actually my first thought, but I wasn’t sure how far I’d get without GPUs. I think they’re pretty much required for Stable Diffusion. I’m pretty sure LLMs are trained on GPUs too, but maybe inference (generating responses) can be done on a CPU.
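          From what I can tell, llama.cpp is aimed at exactly that: CPU-only inference on quantized models. Something like this, assuming the llama-cpp-python bindings and a downloaded GGUF model file (the path is a placeholder):

          ```python
          # CPU-only text generation with llama-cpp-python.
          from llama_cpp import Llama

          llm = Llama(model_path="./models/some-7b.q4_0.gguf", n_threads=4)
          out = llm("Q: What was the first Beowulf cluster? A:", max_tokens=64)
          print(out["choices"][0]["text"])
          ```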

          • AggressivelyPassive
            2
            11 months ago

            Not really, at least not in a useful way. I have an i5-6500 in an old Dell desktop, and even with 16 GB of RAM you can’t do much without waiting forever. My M1 Air is way faster.