Llama 3.1 is Meta's latest salvo in the battle for AI dominance

@IndustryStandard · 7 months ago

sunzu · 7 months ago

Does anyone know what it takes to run 70b?

Seems like a minimum of 32 GB RAM and a 4070?

@brucethemoose · 7 months ago

I mean I have a 24GB GPU, and it's almost too slow for me. If someone makes an AQLM I may run it some.

sunzu · 7 months ago

You were able to load 70b just into GPU?

@brucethemoose · edit-2 7 months ago

Yeah, an AQLM 70B will fit in 24GB with very short context, but decent quality.

You never hear about it, mostly because it’s so hard to quantize in the first place, but also because it’s not a GGUF so most people ignore the format, lol.
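
For a rough sense of why a 70B model only fits in 24GB at very low bit widths, here is a back-of-the-envelope sketch. It counts weights only, and the 2-bit figure for an AQLM-style quant is an assumption (the thread doesn't state the bit width). KV cache, activations, and quantization metadata all come on top, which is why context has to stay short.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal "70B"; the real parameter count differs slightly.
N_PARAMS = 70e9

# Bit widths are illustrative; the 2-bit row is an assumed AQLM-style setting.
for label, bits in [("FP16", 16), ("4-bit", 4), ("2-bit (assumed)", 2)]:
    print(f"{label:16s} ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

By this estimate the weights alone take about 17.5 GB at 2 bits, leaving only a few GB of a 24GB card for everything else, while a 4-bit quant (about 35 GB) would need to spill into system RAM.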