• @Finadil
    17 months ago

    Was that with an fp16 model? Don’t be scared to try even a 4-bit quantization; you’d be surprised at how little is lost and how much quicker it is.
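
    As a rough illustration of why so little is lost, here’s a minimal sketch of symmetric 4-bit round-trip quantization with NumPy. This is just the basic idea, not what any particular inference engine actually does (real schemes typically use per-block scales and smarter rounding, which lose even less):

    ```python
    import numpy as np

    def quantize_4bit(w):
        # Map weights to signed integers in [-7, 7] with one scale per tensor.
        scale = np.max(np.abs(w)) / 7.0
        q = np.round(w / scale).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Reconstruct approximate fp32 weights from the 4-bit integers.
        return q.astype(np.float32) * scale

    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight tensor
    q, scale = quantize_4bit(w)
    w_hat = dequantize(q, scale)

    # Relative reconstruction error stays modest even with this naive scheme.
    err = np.abs(w - w_hat).mean() / np.abs(w).mean()
    print(f"mean relative error: {err:.1%}")
    ```

    Each weight now fits in 4 bits instead of 16, so memory (and memory bandwidth, which usually dominates inference speed) drops by roughly 4x.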