• NVIDIA released a demo version of a chatbot that runs locally on your PC, giving it access to your files and documents.

• The chatbot, called Chat with RTX, can answer queries and create summaries based on personal data fed into it.

• It supports various file formats and can integrate YouTube videos for contextual queries, making it useful for data research and analysis.

  • @[email protected]
    15 • 9 months ago

    Pretty much every LLM you can download already has CUDA support via PyTorch.

    However, some of the easier-to-use frontends don’t use GPU acceleration because it’s a bit of a pain to configure across a wide range of hardware models and driver versions. IIRC GPT4All does not use GPU acceleration yet (that might be outdated; I haven’t checked in a while).

    If this makes local LLMs more accessible to people who are not familiar with setting up a CUDA development environment or Python venvs, that’s great news.
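The "pain to configure" point above can be sketched in a few lines: a frontend has to probe for whichever GPU runtime is actually installed and fall back to CPU. This is a hypothetical illustration, not any frontend's real detection logic, and the library names probed are illustrative, not exhaustive.

```python
import ctypes.util

def detect_gpu_backend():
    # Probe for a GPU runtime shared library; real frontends check many more
    # variants (driver versions, vendor-specific paths, etc.).
    for lib, backend in (("cudart", "cuda"), ("vulkan", "vulkan")):
        if ctypes.util.find_library(lib):
            return backend
    return "cpu"

print(detect_gpu_backend())
```

The hard part in practice is everything this sketch glosses over: matching the runtime version to the driver, and to the wheel the frontend shipped with.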

    • @CeeBee
      2 • 9 months ago

      Ollama with Ollama WebUI is the best combo from my experience.
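For context on that combo: Ollama runs a local HTTP server (default port 11434) that frontends like the WebUI talk to. A minimal sketch of building a request against its `/api/generate` endpoint, assuming a model called `llama2` has been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model, prompt):
    # Build (but don't send) a generate request; stream=False asks the server
    # to return a single JSON reply instead of a streamed response.
    payload = json.dumps({"model": model, "prompt": prompt,
                          "stream": False}).encode()
    return urllib.request.Request(OLLAMA_URL, data=payload,
                                  headers={"Content-Type": "application/json"})

req = build_request("llama2", "Why is the sky blue?")
# urllib.request.urlopen(req) would return the completion if the server is running.
print(req.full_url)
```

The WebUI is essentially a friendlier wrapper around calls like this, which is why the pair works well together.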

    • ɐɥO
      1 • 9 months ago

      GPT4All somehow uses GPU acceleration on my RX 6600 XT

      • @[email protected]
        1 • 9 months ago

        Ooh nice. Looking at the change logs, looks like they added Vulkan acceleration back in September. Probably not as good as CUDA/Metal on supported hardware though.

        • ɐɥO
          1 • 9 months ago

          getting around 44 iterations/s (or whatever that means) on my GPU