Lugh@futurology.today to Futurology@futurology.today · English · 11 months ago

**Multiple LLMs voting together on content validation catch each other's mistakes to achieve 95.6% accuracy.** (arxiv.org)

25 comments
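The linked paper's exact voting protocol isn't reproduced in this thread, but the core idea of several models catching each other's mistakes can be sketched as a simple majority vote over independent verdicts. Everything here (the `majority_vote` helper, the "valid"/"invalid" labels) is an illustrative assumption, not the paper's implementation:

```python
from collections import Counter

def majority_vote(labels):
    """Return the label most validators agreed on, and its vote share."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical verdicts from three independent LLM validators on one item:
verdicts = ["valid", "valid", "invalid"]
label, share = majority_vote(verdicts)
# label is "valid"; the dissenting model's mistake is outvoted
```

The accuracy gain comes from errors being (mostly) uncorrelated across models: a single wrong validator is outvoted as long as the others agree.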
**Pennomi** · 3 points · 11 months ago

A lot of the smaller LLMs don't require a GPU at all; they run just fine on a normal consumer CPU.
**copygirl@lemmy.blahaj.zone** · 3 points · 11 months ago

Wouldn't running on a CPU (while possible) make it less energy efficient, though?
**Pennomi** · 3 points · 11 months ago

It depends. A lot of LLMs are memory-constrained. If you're constantly thrashing the GPU memory, it can be both slower and less efficient.
**DavidGarcia@feddit.nl** · 1 point · 11 months ago

Yeah, but roughly 10x slower, at speeds that just don't work for many use cases. When you compare energy consumption per token, though, there isn't much difference.
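The "similar energy per token despite being 10x slower" point follows from a bit of back-of-envelope arithmetic: energy per token is power divided by throughput, so a CPU drawing far less power can roughly cancel out its lower speed. The wattage and tokens/sec figures below are illustrative assumptions, not measurements:

```python
def joules_per_token(power_watts, tokens_per_second):
    """Energy cost per generated token: watts (J/s) divided by tokens/s."""
    return power_watts / tokens_per_second

# Assumed figures: a ~300 W GPU at 50 tok/s vs. a ~35 W CPU at 5 tok/s.
gpu = joules_per_token(power_watts=300, tokens_per_second=50)  # 6.0 J/token
cpu = joules_per_token(power_watts=35, tokens_per_second=5)    # 7.0 J/token
```

Under these assumptions the CPU is 10x slower but only slightly worse per token, which is the trade-off being described: comparable energy cost, but latency that rules out many interactive use cases.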