• Smorty [she/her]
    13 months ago

    Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. The 1.58 comes from log2(3) ≈ 1.58, i.e. each weight is one of three values, so that's essentially a two-bit quantisation with reasonable performance!
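
To make the "1.58 bit" number concrete, here is a rough sketch of ternary quantisation in plain Python. It assumes the absmean scheme (scale by the mean absolute weight, round, clip to -1/0/1); the comment above doesn't say which scheme the HF model actually uses, so treat that as an assumption. The function names are made up for illustration.

```python
import math


def bits_per_weight(num_levels: int) -> float:
    """Information content of one weight that can take num_levels values."""
    return math.log2(num_levels)


def ternary_quantize(weights):
    """Quantise floats to {-1, 0, 1} plus one shared scale.

    Assumption: absmean scaling, not confirmed as the method behind
    the model mentioned in the comment.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0:
        return [0] * len(weights), 0.0
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate reconstruction of the original weights."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    print(f"{bits_per_weight(3):.2f} bits per ternary weight")
    q, s = ternary_quantize([0.9, -0.05, -1.2, 0.4])
    print(q, s)
    print(dequantize(q, s))
```

Three values don't fit in one bit, so in practice each weight gets stored in two bits (or several weights get packed together), which is why calling it "essentially two bit" is fair.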