• @brucethemoose
    110 hours ago

    Dense models that would fit in 100-ish GB, like Mistral Large, would be really slow on that box, and there isn’t a SOTA MoE for that size yet.
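
A rough back-of-envelope sketch of why dense models crawl here: single-stream decoding is usually memory-bandwidth-bound, so tokens/s is capped at roughly bandwidth divided by the bytes read per token. The bandwidth figure and the MoE active-parameter count below are hypothetical placeholders, not specs for any particular box or model; the 123B figure is Mistral Large 2's parameter count, and 0.5 bytes/param assumes roughly 4-bit quantization.

```python
def decode_tokens_per_sec(active_params_b, bytes_per_param, bandwidth_gb_s):
    """Rough upper bound on single-stream decode speed when memory-bound.

    Each generated token has to read every active weight once, so the
    ceiling is bandwidth / (active parameters * bytes per parameter).
    """
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 250.0  # hypothetical unified-memory bandwidth, not a real spec
BYTES_PER_PARAM = 0.5   # assumes ~4-bit quantized weights

# Dense: all 123B parameters are read for every token.
dense = decode_tokens_per_sec(123, BYTES_PER_PARAM, BANDWIDTH_GB_S)

# MoE: only the active experts are read (12B active is a made-up example).
moe = decode_tokens_per_sec(12, BYTES_PER_PARAM, BANDWIDTH_GB_S)

print(f"dense: {dense:.1f} tok/s, MoE: {moe:.1f} tok/s")
```

Batching changes the picture because the same weight read is shared across many requests, which is why the box only starts to make sense with lots of parallel traffic.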

    So, unless you need tons of batching/parallel requests, it’s… kinda neither here nor there?

    As someone else said, the calculus changes with cheaper Strix Halo boxes (assuming those mini PCs are under $3K).