• Alphane Moon (OP) · 10 days ago

    It’s really surprising that Microsoft doesn’t see the presence of a powerful GPU as being enough for “Copilot” certification.

    As things stand today, you can do far more (consumer-facing) ML tasks with a GPU than with any of the NPUs, which tend to have very weak support for things like local LLMs, ML video upscaling, local AI image generation, and video game upscaling.

    • @[email protected]
      link
      fedilink
      110 days ago

      I don’t understand this post. A desktop 4070 has 1:1 FP16, which works out to less than 30 TOPS. MS requires a minimum of 45 TOPS for a device to be Copilot certified; that’s why they’re not certified. Worse, the limited memory pool makes any NV laptop card apart from a 4090 a difficult sell.
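
      Rough arithmetic behind that figure, for reference (the 4070 spec numbers below are approximate published figures, not from this thread):

      # Back-of-the-envelope FP16 throughput for a desktop RTX 4070, assuming
      # ~5888 shader cores, ~2.48 GHz boost, and 1:1 FP16 at 2 FLOPs/core/clock (FMA).
      cores = 5888
      boost_clock_hz = 2.48e9
      flops_per_core_per_clock = 2  # fused multiply-add
      fp16_tops = cores * boost_clock_hz * flops_per_core_per_clock / 1e12
      print(f"~{fp16_tops:.0f} TOPS FP16")  # ~29 TOPS, under the 45 TOPS bar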

      • Alphane Moon (OP) · 10 days ago (edited)

        I am referring to practical use cases.

        For example, how fast would a 45 TOPS NPU ML-upscale a 10 min SD video source to HD (it takes about 15 min with a 3080 + 5800X)? What video upscaling frameworks/applications have support for such NPUs?
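
        Even a naive estimate isn’t encouraging. A sketch of the math, assuming a ~24 fps source, a rough spec-sheet figure of ~119 TOPS FP16 tensor for the 3080, and the big simplification that speed scales linearly with TOPS (none of these numbers are from the thread):

        # Rough throughput math for the 10-minute SD -> HD upscale example.
        frames = 10 * 60 * 24          # ~14,400 frames at an assumed 24 fps
        gpu_seconds = 15 * 60          # ~15 min on the 3080 + 5800X
        gpu_fps = frames / gpu_seconds # ~16 frames/s

        gpu_tops = 119                 # rough RTX 3080 FP16 tensor figure (assumption)
        npu_tops = 45
        npu_seconds = gpu_seconds * gpu_tops / npu_tops  # naive linear scaling
        print(f"GPU: ~{gpu_fps:.0f} fps; naive NPU estimate: ~{npu_seconds / 60:.0f} min")
        # -> roughly 40 minutes, and that ignores whether any framework supports the NPU at all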

        Another example would be local LLMs. Are there any LLMs comparable to, say, Llama 3.2 1B that can be run locally via NPU?

        To my knowledge, there is no video game upscaling tech (comparable to DLSS) that can run off an NPU.

        • @[email protected]
          link
          fedilink
          110 days ago

          Both video upscaling and DLSS use(d) diffusion to upscale images (DLSS has allegedly transitioned to a transformer model). AFAIK there’s no simple way to run diffusion on an NPU as of today.

          Regarding running LLMs locally, well, I’ll take an NPU with 32-64 GB of RAM over an anemic Llama 1-3B model run on the GPU. And that’s before considering people using Windows and taking advantage of MS Olive. Llama 3.3 70B, which has similar performance to Llama 3.1 405B, will run on 64 GB of RAM, ezpz; forget about ever running it on a local PC with an NVIDIA card.
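
          Rough memory math behind that claim (the ~4.5 bits/weight quantization and the overhead figure are my own assumptions, not exact numbers):

          # Rough memory estimate for a 70B-parameter model quantized to ~4.5 bits/weight
          # (e.g. a Q4_K_M-style GGUF), plus a guessed allowance for KV cache and buffers.
          params = 70e9
          bits_per_weight = 4.5
          weights_gb = params * bits_per_weight / 8 / 1e9   # ~39 GB of weights
          overhead_gb = 8                                   # KV cache, runtime buffers (guess)
          print(f"~{weights_gb + overhead_gb:.0f} GB total")  # fits in 64 GB of unified RAM,
                                                               # not in a 16-24 GB GPU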

          My eyes are set on the Strix Halo 128 GB variant; I’m going to put that through its paces.

          BTW, most of the interesting models will fail to run locally due to NVIDIA’s shit VRAM allowance. If NVIDIA gave people a minimum of 16 GB of VRAM, I’m sure MS would happily certify it.

          • Alphane Moon (OP) · 10 days ago

            That’s fair. But do you see where I am coming from?

            Marketing around TOPS isn’t everything.

            Interesting is a relative term. I find upscaling older SD content interesting. You can’t just dismiss this use case because it doesn’t fit into your argument.

            Getting a local LLM (Llama 1B is not as good as cloud LLMs, of course, but it does have valid use cases) running with an Nvidia GPU is extremely simple. Can you provide a 5-bullet-point guide for setting up a local LLM with 32 GB of RAM (64 GB of RAM isn’t that common in laptops)?

            • @[email protected]
              link
              fedilink
              1
              edit-2
              10 days ago

              Install LM Studio

              Profit

              *If you want to use the NPU:

              Apply for the beta branch (3.6.x) at LM Studio

              Install the LM Studio beta

              Profit
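
              Once a model is loaded, LM Studio exposes an OpenAI-compatible local server (by default on http://localhost:1234/v1), so a minimal Python sketch looks like this (the model name is just a placeholder for whatever you’ve loaded):

              # Minimal sketch: query a model loaded in LM Studio through its local
              # OpenAI-compatible server (default http://localhost:1234/v1; check the Server tab).
              from openai import OpenAI

              client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
              reply = client.chat.completions.create(
                  model="llama-3.2-1b-instruct",  # placeholder: use the id of the model you loaded
                  messages=[{"role": "user", "content": "Give me one sentence on NPUs."}],
              )
              print(reply.choices[0].message.content)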

              Edit: Almost forgot, the AMD drivers (under review) for the latest NPU-containing CPUs (7xxx and upward) should come with the spring kernel update to 6.3, fingers crossed. It’s been two years; they took their sweet time. Windows support was available on release…