• @[email protected]
    62 days ago

    I picked up a pair of old Tesla P40s. Right now I’m running a Q4 quant of Qwen 2.5 72B that fits in the combined 48GB of VRAM with 12k context. They aren’t as fast as newer consumer cards, but it generates as fast as I can read while costing less than a used 3080.
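
    For anyone checking whether a similar setup would fit, here is a rough back-of-the-envelope sketch of the memory math. None of the constants come from the post above: the parameter count, layer count, KV head count, and head size are my understanding of the published Qwen 2.5 72B config, 4.5 bits per weight assumes a Q4_0-style quant (other Q4 variants are larger or smaller), and the fp16 KV cache is also an assumption. It ignores compute buffers and other runtime overhead.

```python
# Rough VRAM estimate for a 4-bit 72B model with a 12k context.
# Every constant here is an assumption for illustration, not a figure from the post.

PARAMS = 72.7e9         # assumed parameter count for Qwen 2.5 72B
BITS_PER_WEIGHT = 4.5   # assumed Q4_0-style quant: 4-bit weights + one fp16 scale per 32 weights
N_LAYERS = 80           # assumed transformer layer count
N_KV_HEADS = 8          # assumed number of key/value heads (grouped-query attention)
HEAD_DIM = 128          # assumed per-head dimension
KV_BYTES = 2            # assumed fp16 KV cache entries
CONTEXT = 12 * 1024     # "12k" read as 12288 tokens
VRAM_GB = 48            # two 24 GB cards


def weights_bytes(params, bits_per_weight):
    """Bytes needed to hold the quantized weights."""
    return params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, bytes_per_elem, context):
    """Bytes needed for the KV cache; the factor of 2 covers keys and values."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * context


weights_gb = weights_bytes(PARAMS, BITS_PER_WEIGHT) / 1e9
kv_gb = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, KV_BYTES, CONTEXT) / 1e9
total_gb = weights_gb + kv_gb

print(f"weights:  {weights_gb:.1f} GB")
print(f"KV cache: {kv_gb:.1f} GB")
print(f"total:    {total_gb:.1f} GB of {VRAM_GB} GB")
```

    Under those assumptions the total comes to roughly 45 GB, which leaves only a few GB of headroom across both cards, so a heavier Q4 variant or a longer context could easily tip it over.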

    • @BatrickPateman
      11 days ago

      Interesting. They are cooled passively, right? What’s your case and cooling setup?

      • @[email protected]
        119 hours ago

        I have a Dell PowerEdge 730, which was about $200. Its CPU shrouds perfectly match the GPU intakes, so air just flows through both from the server fans. I’ve seen a few 3D-printable fan mounts for jury-rigging them into a regular tower too.