Abacus.ai:

We recently released Smaug-72B-v0.1, which has taken first place on Hugging Face's Open LLM Leaderboard. It is the first open-source model to have an average score of more than 80.

  • @[email protected]
    11 months ago

    I’m pretty sure you can load the model using RAM like another poster said. Here’s a used server under $600 that could theoretically run it: ebay.

    • @[email protected]
      11 months ago

      You would want to look for an R730, which can be had for not too much more. The 20 series was the “end of an era” and the 30 series was the beginning of the next era. Most importantly for this application, the 30 series uses DDR4 whereas the 20 series uses DDR3.

      RAM speed matters a lot for ML applications and DDR4 is about 2x as fast as DDR3 in all relevant measurements.

      If you’re going to offload any part of these models to CPU, which you will almost certainly have to do for a model of this size with this class of hardware, skip the 20s and go to the 30s.
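
      To put rough numbers on why RAM speed matters here, a back-of-the-envelope sketch: during CPU inference on a dense model, generating each token reads roughly all of the weights from memory once, so peak memory bandwidth divided by model size gives an upper bound on tokens per second. Everything hardware-specific below is an assumption for illustration (DDR3-1600 and DDR4-2400 DIMMs, 4 memory channels, a single socket, approximate bytes per parameter for each weight format). Plug in the numbers for the box you are actually looking at; real throughput will be lower than these ceilings.

```python
# Back-of-the-envelope estimate of weight size and bandwidth-bound decode speed.
# All hardware figures are illustrative assumptions, not measurements.

PARAMS = 72e9  # Smaug-72B: roughly 72 billion parameters

# Approximate bytes per parameter for common weight formats (assumed; real
# quantized files carry some extra overhead for scales and metadata).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weights_gb(params, bytes_per_param):
    """Size of the model weights in GB (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def peak_bandwidth_gbs(mega_transfers_per_s, channels, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s for one socket.

    Each DDR channel is 64 bits (8 bytes) wide, so peak bandwidth is
    transfers/s * 8 bytes * number of channels.
    """
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


def max_tokens_per_s(bandwidth_gbs, model_gb):
    """Upper bound on tokens/s if every token reads all weights once."""
    return bandwidth_gbs / model_gb


# Assumed configurations: 4 channels populated on a single socket.
configs = {
    "DDR3-1600 (20 series, assumed)": peak_bandwidth_gbs(1600, channels=4),
    "DDR4-2400 (30 series, assumed)": peak_bandwidth_gbs(2400, channels=4),
}

for fmt, bpp in BYTES_PER_PARAM.items():
    size = weights_gb(PARAMS, bpp)
    print(f"{fmt}: ~{size:.0f} GB of weights")
    for name, bw in configs.items():
        print(f"  {name}: {bw:.1f} GB/s peak, "
              f"at most ~{max_tokens_per_s(bw, size):.2f} tokens/s")
```

      The ceiling scales linearly with memory bandwidth, which is why the DDR generation matters so much once any of the model lives in system RAM. The sketch ignores the KV cache, compute limits, and NUMA effects on dual-socket machines.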