How to get started? A number of questions ....

@Solvena · 2 years ago

ffhein · 2 years ago

If you’re using llama.cpp, it can split the work between GPU and CPU, which lets you run larger models at the cost of some speed. I also have 12 GB of VRAM and I’m mostly playing around with llama-2-13b-chat. llama.cpp is more of a library than a program, but it does come with a simple terminal program to test things out. However, many GUI/web programs use llama.cpp, so I expect them to be able to do the same.
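
To give a feel for the GPU/CPU split, here is a rough back-of-the-envelope sketch in Python. It is not part of llama.cpp: the helper `layers_that_fit`, the 1.5 GB reserve, and the 13 GB file size are all assumptions made up for illustration, and it pretends every layer takes an equal share of the model file. The result is the kind of number you would pass to llama.cpp's `--n-gpu-layers` (`-ngl`) option, with the remaining layers staying on the CPU.

```python
def layers_that_fit(model_size_gb: float, n_layers: int,
                    vram_gb: float, reserve_gb: float = 1.5) -> int:
    """Estimate how many model layers can be offloaded to the GPU.

    Assumption: each layer takes an equal share of the model file.
    This ignores embeddings, the output head and the KV cache, so
    treat the result as a starting point, not a guarantee.
    """
    if model_size_gb <= 0 or n_layers <= 0:
        raise ValueError("model_size_gb and n_layers must be positive")

    per_layer_gb = model_size_gb / n_layers
    # Keep some VRAM free for context, scratch buffers and the desktop.
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb / per_layer_gb))


if __name__ == "__main__":
    # Hypothetical figures: a 13 GB model file with 40 layers on a 12 GB card.
    n_gpu = layers_that_fit(model_size_gb=13.0, n_layers=40, vram_gb=12.0)
    print(f"offload {n_gpu} of 40 layers to the GPU, the rest run on the CPU")
```

In practice you would start from an estimate like this and lower the layer count if you run out of memory, or raise it if there is VRAM to spare.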

As for GUI programs, I’ve seen gpt4all, kobold and SillyTavern, but I never got any of them to run in Docker with GPU acceleration.