• @brucethemoose
    12 hours ago

    My friend, the Chinese have been releasing amazing models all of last year; they just didn’t make headlines.

    Tencent’s Hunyuan Video is incredible. Alibaba’s Qwen is still a go-to local model. I’ve used InternLM pretty regularly… Heck, Yi 32B was awesome back in 2023 as the first decent long-context local model.

    …The Janus models are actually kind of meh unless you’re captioning images, and FLUX/Hunyuan Video are still king in the diffusion world.

    • λλλ
      14 hours ago

      Are any of them useful for programming? Preferably with local hosting only.

      • @brucethemoose
        3 hours ago

        I mean, if you have a huge GPU, sure. Or at least 12GB of free VRAM, or a big Mac.

        Local LLMs for coding are kind of a niche, because most people don’t have a 3090 or 7900 lying around, and you really need 12GB+ of free VRAM before the models start being “smart” enough to be worth using over free LLM APIs, much less cheap paid ones.

        But if you do have the hardware and the time to set up a server, the Deepseek R1 models or the FuseAI merges are great for “slow” answers where the model thinks things through before replying. Qwen 2.5 Coder 32B is great for quick answers on 24GB of VRAM. Arcee 14B is great for 12GB of VRAM. A rough sketch of what talking to one of these looks like is below.
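
        For reference, here is a minimal sketch of querying a model once a local server is running. It assumes an OpenAI-compatible endpoint on localhost:8080 (llama.cpp’s server and Ollama both expose one); the port and model name are placeholders for whatever you actually loaded.

        ```python
        # Minimal sketch: talk to a locally hosted model through its
        # OpenAI-compatible API. Assumes a server (e.g. llama.cpp's
        # llama-server or Ollama) is already running on localhost:8080;
        # the model name below is a placeholder.
        from openai import OpenAI

        client = OpenAI(
            base_url="http://localhost:8080/v1",  # local server, not OpenAI
            api_key="not-needed-locally",         # most local servers ignore this
        )

        response = client.chat.completions.create(
            model="qwen2.5-coder-32b",  # placeholder; use the name your server reports
            messages=[
                {"role": "user", "content": "Write a Python function that reverses a linked list."},
            ],
            temperature=0.2,  # low temperature tends to work better for code
        )

        print(response.choices[0].message.content)
        ```

        The same client code works regardless of which backend is serving the model, which makes it easy to swap between the models mentioned above.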

        Sometimes running a smaller model on a “fast” but less VRAM-efficient backend is better for stuff like Cursor-style code completion.