• Alphane Moon (OP)
    21 days ago

    It also depends on your use case. I have 10 GB of VRAM and local LLMs work fine for spell/style checking, idea generation, and name generation (naming planet clusters thematically in Star Ruler 2).
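
    For the name-generation case, a tiny script talking to a local server is all it takes. This is just a sketch assuming an Ollama install with its default REST endpoint; the model name and prompt are placeholders, and any small model that fits in ~10 GB VRAM would do:

    ```python
    import requests

    # Ask a local Ollama server (default port 11434) for themed names.
    # "llama3:8b" is a placeholder model tag; swap in whatever you have pulled.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3:8b",
            "prompt": "Suggest five thematic names for a cluster of desert planets.",
            "stream": False,
        },
        timeout=120,
    )
    print(resp.json()["response"])
    ```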

    • @j4k3
      21 days ago

      I get functional code snippets for around 3 out of 4 questions in any major and most minor languages from a local model. I also get good summaries of code functionality if I paste up to around 1k lines into the context, and fun collaborative story writing in different formats using unique themes from my own science fiction universe. I explored smaller models in hopes of fine-tuning them before I discovered the utility of a much larger but quantized model. I never use anything smaller than a 70B or an 8×7B because there is no real comparison in my experience, for my uses. On my hardware, these generate a text stream close to my reading pace. A rough sketch of how a model like that gets loaded is below.
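
      Purely as an illustration of the setup (the GGUF path, context size, and layer split are placeholders that depend on your hardware), a large quantized model can be run through llama-cpp-python with part of the weights offloaded to the GPU and the rest left in system RAM:

      ```python
      from llama_cpp import Llama

      # Placeholder GGUF path; a 70B at Q4 quantization is roughly 40 GB on disk,
      # so only some layers fit in VRAM and the remainder stays in system RAM.
      llm = Llama(
          model_path="models/llama-70b.Q4_K_M.gguf",
          n_ctx=8192,        # enough context to paste ~1k lines of code
          n_gpu_layers=30,   # offload what fits in VRAM; tune for your card
      )

      prompt = "Summarize what the following code does:\n\n" + open("snippet.py").read()
      out = llm(prompt, max_tokens=512)
      print(out["choices"][0]["text"])
      ```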

    • @[email protected]
      21 days ago

      Image and video generation is where you really hit the VRAM bottleneck. I have a 12 GB 4070 and it cannot generate any video despite my best efforts and tweaks; the usual memory-saving tricks are sketched below.
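
      For context, these are the kinds of memory-saving knobs people usually try with diffusers; the model ID and frame count are placeholders, and even with all of this a 12 GB card can still run out of VRAM:

      ```python
      import torch
      from diffusers import DiffusionPipeline
      from diffusers.utils import export_to_video

      # Placeholder text-to-video model; fp16 roughly halves weight memory.
      pipe = DiffusionPipeline.from_pretrained(
          "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
      )

      # Typical VRAM-saving tweaks: stream weights from CPU as needed and slice the VAE.
      pipe.enable_model_cpu_offload()
      pipe.enable_vae_slicing()

      frames = pipe("a drone shot over a mountain lake", num_frames=16).frames[0]
      export_to_video(frames, "out.mp4")
      ```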