• @bassomitron
    English
    28 months ago

    Do you know if there are any plans to quantize it? I’d love to test it, but unfortunately my 3090 can’t handle 70B models without quantization.

    • midnight
      8 months ago

      There are quantized versions on Hugging Face. There’s a Q2 version, but I don’t know how well it performs.

    • ffhein
      English
      28 months ago

      Only quantized versions of the model were leaked. Any unquantized version you see was recreated from those and is not the original model. People have also requantized it from GGUF to EXL2, and probably to other formats too.
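
For anyone wondering why a 3090 struggles with a 70B model, here is a rough back-of-the-envelope sketch of the weight sizes at a few bit widths. The assumptions are mine, not from the thread: the bit widths are nominal (real GGUF or EXL2 quants usually average somewhat more bits per weight than their name suggests), the 24 GB figure is the 3090's VRAM, and the estimate covers weights only, ignoring the KV cache, activations, and file-format overhead. `weight_gb` is just a helper name made up for this example.

```python
# Rough, weight-only memory estimate for a 70B-parameter model.
# Assumptions: nominal bits per weight, 1 GB = 1e9 bytes, and no
# allowance for KV cache, activations, or format overhead.
PARAMS = 70e9   # "70B" parameters
VRAM_GB = 24    # RTX 3090 VRAM


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Return the size of the weights alone, in GB."""
    return params * bits_per_weight / 8 / 1e9


for bits in (16, 8, 4, 2):
    size = weight_gb(PARAMS, bits)
    verdict = "fits" if size <= VRAM_GB else "does not fit"
    print(f"{bits:>2}-bit: {size:6.1f} GB -> {verdict} in {VRAM_GB} GB")
```

Under these assumptions only the roughly 2-bit case leaves the weights under 24 GB, and that is before any context is loaded, so treat it as an upper bound on what fits rather than a guarantee.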