• @Finadil
    11 years ago

    Is that with an fp16 model? Don’t be scared to try even a 4-bit quantization; you’d be surprised at how little is lost and how much quicker it is.
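    The intuition behind the comment can be sketched with a toy example. This is a minimal, illustrative absmax-style 4-bit quantizer for a handful of weights, not a real inference stack like llama.cpp or bitsandbytes: each value is mapped to one of 15 signed integer levels (-7..7) plus a scale, and the round trip shows the per-weight error stays bounded by half the quantization step.

    ```python
    # Toy symmetric 4-bit (absmax) quantization sketch -- illustrative only.
    # Real quantizers work per-block/per-channel and use smarter rounding.

    def quantize_4bit(weights):
        """Map floats to signed int4 levels (-7..7) plus a shared scale."""
        scale = max(abs(w) for w in weights) / 7
        q = [round(w / scale) for w in weights]
        return q, scale

    def dequantize(q, scale):
        """Recover approximate floats from int4 levels and the scale."""
        return [v * scale for v in q]

    weights = [0.12, -0.53, 0.91, -0.07, 0.33]
    q, scale = quantize_4bit(weights)
    restored = dequantize(q, scale)

    # Worst-case round-trip error is at most half a quantization step.
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    ```

    Each original weight takes 16 or 32 bits; here it fits in 4 bits plus a shared scale, which is where the speed and memory win comes from.
    
    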