Meta on Tuesday announced the release of Llama 3.1, the latest version of its large language model, which the company claims now rivals competitors from OpenAI and Anthropic. The new model comes just three months after Meta launched Llama 3 by integrating it into Meta AI, a chatbot that now lives in Facebook, Messenger, Instagram and WhatsApp and also powers the company’s smart glasses. In the interim, OpenAI and Anthropic have already released new versions of their own AI models, a sign that Silicon Valley’s AI arms race isn’t slowing down any time soon.

Meta said that the new model, called Llama 3.1 405B, is the first openly available model that can compete against rivals in general knowledge, math skills and translation across multiple languages. The model was trained on more than 16,000 NVIDIA H100 GPUs, which are currently the fastest available chips and cost roughly $25,000 each, and can beat rivals on over 150 benchmarks, Meta claimed.

  • sunzu
    2 months ago

    Does anyone know what it takes to run 70b?

    Seems like min 32gb RAM and 4070?

    • @brucethemoose
      2 months ago

      I mean I have a 24GB GPU, and it's almost too slow for me. If someone makes an AQLM I may run it some.

      • sunzu
        2 months ago

        You were able to load 70b just into GPU?

        • @brucethemoose
          2 months ago (edited)

          Yeah, an AQLM 70B will fit in 24GB with very short context, but decent quality.

          You never hear about it, mostly because it’s so hard to quantize in the first place, but also because it’s not a GGUF so most people ignore the format, lol.
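
For anyone wanting to sanity-check the sizes discussed in this thread, here is a rough back-of-the-envelope sketch. It only estimates the memory taken by the weights themselves. The helper name `weight_gib`, the round figure of 70 billion parameters, and the assumption that an AQLM quant lands at roughly 2 bits per weight are all illustrative and not taken from the thread. The KV cache and runtime overhead are ignored, which is likely why only a very short context fits alongside the weights.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB.

    Ignores KV cache, activations, and framework overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


VRAM_GIB = 24  # the 24GB card mentioned above

# Bits-per-weight values are nominal; real quant formats add some overhead.
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4), ("~2-bit", 2)]:
    size = weight_gib(70, bits)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{label:>7}: {size:6.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

Under these assumptions, only the roughly 2-bit case (about 16 GiB of weights) leaves any headroom on a 24GB card, while a 4-bit 70B already needs more than 30 GiB and would have to be split between GPU and system RAM.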