• @cyd
      10 hours ago

      Base models are general purpose language models, mainly useful for AI researchers and people who want to build on top of them.

      Instruct or chat models are chatbots. They are made by fine-tuning base models.

      The V3 models linked by OP are DeepSeek’s non-reasoning models, similar to Claude or GPT-4o. These are the “normal” chatbots that reply with whatever first comes to mind. DeepSeek also has a reasoning model, R1. Such models take time to “think” before supplying their final answer; they tend to perform better on things like math problems, at the cost of taking longer to answer.
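      To make the distinction concrete: with DeepSeek’s OpenAI-compatible chat API, choosing between the two families is mostly a matter of which model name you send (“deepseek-chat” for V3, “deepseek-reasoner” for R1). A minimal sketch that only builds the request payloads (the helper function is hypothetical, and no network call is made):

```python
# Sketch: selecting DeepSeek's non-reasoning (V3) vs reasoning (R1)
# model via its OpenAI-compatible chat API. Only the JSON payload is
# built here; the helper is hypothetical and nothing is sent.

def make_request(model: str, question: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }

chat = make_request("deepseek-chat", "What is 17 * 24?")          # V3: answers directly
reasoner = make_request("deepseek-reasoner", "What is 17 * 24?")  # R1: "thinks" first

print(chat["model"], reasoner["model"])
```

      The reasoning model streams its chain of thought separately before the final answer, which is where the extra latency comes from.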

      It should be mentioned that you probably won’t be able to run these models yourself unless you have a data-center-style rig with 4-5 GPUs. The DeepSeek V3 and R1 models are chonky beasts. There are smaller “distilled” versions of R1 that are possible to run locally, though.

      • @DogWater
        3 hours ago

        I heard people saying they could run the R1 32B model on moderate gaming hardware, albeit slowly.

    • @[email protected]
      10 hours ago

      R1 is lightweight and optimized for local environments on a home PC. It’s supposed to be pretty good at programming and logic, and kinda awkward at conversation.

      V3 is powerful and meant to run on cloud servers. It’s supposed to make for some pretty convincing conversations.

      • Pennomi
        10 hours ago

        R1 isn’t really runnable on a home rig. You might be able to run a distilled version of the model, though!

        • @theunknownmuncher
          9 hours ago

          Tell that to my home rig currently running the 671b model…

          • Pennomi
            9 hours ago

            That likely is one of the distilled versions I’m talking about. R1 is 720 GB and wouldn’t even fit into memory on a normal computer. Heck, even the 1.58-bit quant is 131 GB, which is outside the range of a normal desktop PC.

            But I’m sure you know what version you’re running better than I do, so I’m not going to bother guessing.
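            Those figures are easy to sanity-check: weight memory is roughly parameter count times bits per weight. A back-of-the-envelope sketch in Python (ignoring KV cache, activations, and file-format overhead, so it won’t exactly match the quoted on-disk sizes):

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB.
    Ignores KV cache, activations, and file-format overhead."""
    return params * bits_per_weight / 8 / 1e9

PARAMS = 671e9  # DeepSeek R1 has roughly 671B parameters

fp8 = weight_memory_gb(PARAMS, 8)       # native FP8 weights
quant = weight_memory_gb(PARAMS, 1.58)  # 1.58-bit dynamic quant

print(f"FP8: ~{fp8:.0f} GB, 1.58-bit: ~{quant:.0f} GB")
# prints: FP8: ~671 GB, 1.58-bit: ~133 GB
```

            That lands in the same ballpark as the 720 GB and 131 GB figures above; the gap is the overhead this estimate ignores.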

              • Pennomi
                8 hours ago

                You must have a lot of memory, sounds like a lot of fun!

        • @[email protected]
          8 hours ago

          You’re absolutely right. I wasn’t trying to get that in-depth, which is why I said “lightweight and optimized” instead of “when using a distilled version”; the latter raises more questions than it answers. But I probably overgeneralized by making it a blanket statement like that.