• Jeena
    106 hours ago

    Exactly, I’m in the same situation now, and the 8 GB on those cheaper cards doesn’t even let you run a 13B model. I’m trying to figure out whether I can run a 13B one on a 3060 with 12 GB.
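    Some back-of-the-envelope math (assuming 4-bit quantization, the common choice for local setups; the overhead figure is a rough assumption) suggests a 13B model’s weights do fit in 12 GB:

    ```python
    # Rough VRAM estimate for a 13B-parameter model at 4-bit quantization.
    # All numbers are illustrative assumptions, not measurements.
    params = 13e9
    bytes_per_param = 0.5           # 4-bit quantization ~ 0.5 bytes per parameter
    weights_gb = params * bytes_per_param / 1024**3
    overhead_gb = 2.0               # assumed KV cache + runtime overhead
    total_gb = weights_gb + overhead_gb
    print(f"{weights_gb:.1f} GB weights, ~{total_gb:.1f} GB total")
    ```

    That leaves headroom on a 12 GB card, though a large context window eats into it quickly.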

    • @[email protected]
      258 minutes ago

      I’m running deepseek-r1:14b on a 12GB rx6700. It just about fits in memory and is pretty fast.

    • The Hobbyist
      54 hours ago

      You can. I’m running a 14B deepseek model on mine. It achieves 28 t/s.

      • @levzzz
        229 minutes ago

        You need a pretty large context window to fit all the reasoning; ollama forces 2048 by default, and a larger window uses more memory.
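        As a rough sketch of why context length costs memory: the KV cache grows linearly with it. The layer/head counts below are illustrative assumptions for a ~14B model, not the actual architecture:

        ```python
        # KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * ctx * bytes/value
        # Architecture numbers are illustrative assumptions, not DeepSeek's real config.
        layers, kv_heads, head_dim = 48, 8, 128
        bytes_per_val = 2  # fp16 cache

        def kv_cache_mb(ctx):
            return 2 * layers * kv_heads * head_dim * ctx * bytes_per_val / 1024**2

        print(kv_cache_mb(2048))   # ollama's default context
        print(kv_cache_mb(8192))   # 4x the context -> 4x the cache
        ```

        In ollama the window can be raised with `/set parameter num_ctx 8192` in an interactive session, or `PARAMETER num_ctx 8192` in a Modelfile.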

      • Jeena
        54 hours ago

        Oh nice, that’s faster than I imagined.

      • @[email protected]
        12 hours ago

        I also have a 3060. Can you detail which framework (sglang, ollama, etc.) you are using and how you got that speed? I’m having trouble reaching that level of performance. Thx
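        For reference, ollama’s `/api/generate` response includes `eval_count` and `eval_duration` (in nanoseconds), which is where tokens/s figures come from. The values below are made-up examples, not real measurements:

        ```python
        # Compute tokens/s from ollama's reported generation stats.
        # eval_count / eval_duration are real response fields; the values are invented.
        resp = {"eval_count": 280, "eval_duration": 10_000_000_000}  # 280 tokens in 10 s
        tokens_per_s = resp["eval_count"] / resp["eval_duration"] * 1e9
        print(f"{tokens_per_s:.1f} t/s")
        ```

        Running `ollama run <model> --verbose` prints the same eval rate after each reply, which makes it easy to compare setups.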