How does Lemmy feel about "open source" machine learning, akin to the Fediverse vs Social Media?

@brucethemoose · edit-2 6 months ago

How does Lemmy feel about "open source" machine learning, akin to the Fediverse vs Social Media?

@[email protected] · 6 months ago

I love the idea, and I much prefer it to the mainstream. The problem is that the typical ways of documenting FOSS and self-hosted projects (websites, wikis, mailing lists, etc.) move too slowly and are too cumbersome for how quickly things are developing right now. So people are kind of having to invent the new tech and new ways to communicate about it, and they’re not always making choices that either scale or are easy to find and reference.

Okay, since you seem to be so helpful here, I’ll lay out where I’m at. I’ve been using LLMs like ChatGPT, Copilot, and Bard more professionally. I find them equal parts useful, confusing, annoying, and skeevy. I’ve got a lil VPS I run for services, and I could put a front end on there easily. I’ve also got an old 8-core Xeon machine with like 48GB RAM and a leftover AMD R9 270 sitting there with Unraid barely installed. I can change the OS of course, but what am I realistically looking at being able to run locally that won’t go above like 60-75% usage so I can still eventually get a couple game servers, network storage, and Jellyfin working? I’ll be honest, I don’t care about image generation much, but if I do I can always look into upgrading.

@brucethemoose · edit-2 6 months ago

“but what am I realistically looking at being able to run locally that won’t go above like 60-75% usage so I can still eventually get a couple game servers, network storage, and Jellyfin working?”

Honestly, not much. Llama 8B, but very slowly, or maybe DeepSeek V2 Chat, with the prompt preprocessed on the 270 via Vulkan but mostly running on CPU. And I guess just limit it to 6 threads? I’d host it with the Vulkan build of kobold.cpp, or maybe the llama.cpp server if there will be multiple users.
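
As a rough sketch of where the 6 threads comes from: 75% of 8 cores is 6. The snippet below works that out and assembles a llama.cpp server command line. It assumes 8 physical cores and a binary named `llama-server` (the name depends on your build); the model path is a made-up placeholder, and `thread_budget` and `llama_server_cmd` are just illustrative helpers, not part of any tool.

```python
import math


def thread_budget(cores: int, max_usage: float) -> int:
    """Largest whole number of threads that stays at or under the usage cap."""
    return max(1, math.floor(cores * max_usage))


def llama_server_cmd(model_path: str, threads: int,
                     gpu_layers: int = 0, port: int = 8080) -> list[str]:
    """Build (not run) a llama.cpp server command line.

    -m   path to the GGUF model file
    -t   number of CPU threads used for generation
    -ngl number of layers to offload to the GPU (0 = keep weights on CPU)
    """
    return [
        "llama-server",          # assumption: binary name varies by build
        "-m", model_path,
        "-t", str(threads),
        "-ngl", str(gpu_layers),
        "--host", "127.0.0.1",
        "--port", str(port),
    ]


# Assumption: 8 physical cores, 75% cap taken from the question above.
threads = thread_budget(cores=8, max_usage=0.75)
cmd = llama_server_cmd("models/model.gguf", threads)  # hypothetical path
print(" ".join(cmd))
```

A thread cap only limits what the model process asks for. It doesn’t reserve the other cores for the game servers or Jellyfin, so you’d probably still want to watch actual load.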

You can try them to see if they feel OK, but LLMs are just not something that likes old hardware. An RTX 3060 (or a Mac, or a 12GB+ AMD GPU) is considered the bare minimum in the community, and a 3090 or 7900 XTX is standard.
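
For a sense of why 12GB keeps coming up, here’s a back-of-the-envelope estimate of how much memory the weights alone take at different precisions. It’s only parameters × bits per weight; real quantized files carry some extra overhead, and the context cache needs memory on top, so treat these as lower bounds. `weight_gib` is a hypothetical helper name.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# An 8B-parameter model at a few common precisions.
for bits in (16, 8, 4):
    print(f"8B @ {bits}-bit: ~{weight_gib(8, bits):.1f} GiB")
```

By that math a 4-bit 8B model fits easily in 48GB of system RAM, so on the old Xeon the bottleneck is speed, not capacity.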