• @[email protected]
    English
    611 days ago

    For the small ones running on GPUs, a couple hundred watts while generating. For the large ones, somewhere between 10 and 100 times that.

    With specialty hardware, maybe 10x less.

    • Pennomi
      English
      311 days ago

      A lot of the smaller LLMs don’t require a GPU at all - they run just fine on a normal consumer CPU.

      • copygirl
        English
        311 days ago

        Wouldn’t running on a CPU (while possible) make it less energy efficient, though?

        • Pennomi
          English
          311 days ago

          It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing GPU memory, it can be both slower and less efficient.

      • @[email protected]
        English
        110 days ago

        Yeah, but 10x slower, at speeds that just don’t work for many use cases. And when you compare energy consumption per token, there isn’t much difference.
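
        To make the per-token comparison concrete, here is a rough sketch of the arithmetic. Watts are joules per second, so energy per token is just power draw divided by generation speed. The wattages and token rates below are made-up illustrative numbers, not measurements of any real hardware, and `joules_per_token` is a hypothetical helper name.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: W = J/s, so J/token = W / (tokens/s)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


# Hypothetical figures for illustration only (assumptions, not benchmarks):
# a GPU drawing 200 W at 50 tokens/s, and a CPU drawing 20 W but 10x slower.
gpu = joules_per_token(power_watts=200.0, tokens_per_second=50.0)
cpu = joules_per_token(power_watts=20.0, tokens_per_second=5.0)

print(f"GPU: {gpu:.1f} J/token, CPU: {cpu:.1f} J/token")
```

        With these assumed numbers both come out the same per token: the slower device draws less power but has to run ten times longer for the same output. Plug in measured wattage and throughput from your own setup to see which way it actually tips.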