Your Lemmy Crash Course to Free Open-Source AI

@Blaed · edit-2 1 year ago

Your Lemmy Crash Course to Free Open-Source AI

@[email protected] · 2 years ago

This is super helpful, thanks a lot! Just what I was looking for!

Shelldor · 2 years ago

Glad to have found this. I just discovered all this a few days ago and have been hooked on ai since it is amazing

@Blaed · edit-2 2 years ago

Greetings from across the Fediverse! We’re happy to have you. I’m excited to see what our future holds. If you want to learn more about AI, consider checking out UnderstandGPT and some of the other partner communities on the sidebar!

db0 · edit-2 2 years ago

Can you please add a section mentioning the AI Horde and the ecosystem of software around it?

@[email protected] · 2 years ago

Do you have any good resources on tips/tricks for using Stable Diffusion effectively? I’ve got it set up running in a docker container with AUTOMATIC111’s UI, running on my RX 6700xt. I’ve played around for a few hours, but am looking for some good prompt tips, insights into which models are better, what each tool does.

Looking up info on this stuff is kind of hit or miss as to the quality of it, it seems so far. Or it’s intended for an older version or something similar.

@Blaed · 2 years ago

Have you had a chance to try ControlNet? There are some some really cool workflows with this.

Stable Diffusion + ControlNet gives you a sense of ‘manual’ control over your AUTOMATIC1111/web-ui workflow.

I quite like it, and I think there’s a ton of potential if you combine it with other processing techniques. Prompts are fickle things, there is no one-size fits all, but hopefully these help you adjust and tune your prompts as you play around with the toolset.

If you’re looking for models, civitai is a good place to try out new styles, checkpoints, LoRAs, and other downloadable content you can use to enhance your Diffusion Suite.

If you’re not sure what ControlNet is, try starting with this video here which goes over the workflow I personally use for some of my own projects.

In the case you’re looking to break into generative video content (based on similar stable diffusion web-ui workflows), you’ll want to check out TemporalKit + EbSynth (which is also detailed in the FOSAI Nexus)!

@elghoto · 2 years ago

Beyond the tools you are showing here. What type of hardware do you have at home to run most of these?

@Blaed · edit-2 2 years ago

Great question. I suggest visiting UnderstandGPT for the full table, but here’s a brief breakdown of current home GPU/VRAM recommendations (as of June 2023):

Model Size

7B

```
   Required VRAM (4bit): 6GB
```
```
   Required VRAM (8bit): 10GB
```

   Recommended GPU (4bit): GTX 1660, 2060, AMD 5700 XT, RTX 3050, 3060

13B

```
   Required VRAM (4bit): 10GB
```
```
   Required VRAM (8bit): 20GB
```

   Recommended GPU (4bit): AMD 6900 XT, RTX 2060 12GB, 3060 12GB, 3080, A2000

30B

```
   Required VRAM (4bit): 20GB
```
```
   Required VRAM (8bit): 40GB
```

   Recommended GPU (4bit): RTX 3080 20GB, A4500, A5000, 3090, 4090, 6000, Tesla V100

65B

```
   Required VRAM (4bit): 40GB
```
```
   Required VRAM (8bit): 80GB
```

   Recommended GPU (4bit): A100 40GB, 2x3090, 2x4090, A40, RTX A6000, 8000

In terms of CPU requirements, you can run inference on all sorts of hardware. There are people who have been able to run AI/LLM models on their laptops and others with little to no GPU whatsoever. Although for best results, a strong GPU will be important. CUDA cores (NVIDIA specific hardware on all 2xxx/3xxx/4xxx series graphics cards) utilize advanced acceleration algorithms that significantly boost AI performance. An important detail to keep in mind for anyone wanting to run models on NVIDIA cards. You get a boost in that regard.

For my fellow gamers - NVIDIA CUDA cores are what help process your in-game DLSS and RTX (among many other components), settings I’m sure you’ve explored turning on or off to boost your FPS. This is the same tech that gives you an advantage running AI on an NVIDIA GPU at home.

There is not yet an AMD equivalent to CUDA cores, but they have recently partnered with HuggingFace to explore how to offer more competition in this space.

Storage is up to you. Make sure you read file sizes before downloading. I learned that the hard way. Some of these file sizes can easily blow up your hard drive space. Consider dedicating a disk or large folder for all of your AI tinkering and workloads. I personally dedicate a 1TB drive that I use to archive the many models that I experiment with, but it’s overkill for most. You could get away with 128GB/250GB/500GB of storage if you stayed organized. If you plan to only run the small models, 8GB - 24GB should be plenty of room.

For RAM, it’s suggested to have 16GB+, but it’s not as important as GPU + CPU power (a compute combination possible for GGML models - a popular format that let’s you combine the power of both your graphics card and your processor). It’s worth noting RAM might help you load and unload models a little faster, especially so for the larger parameter variants - but the your CPU & GPU are far more important at the moment. In my opinion, 32GB/64GB of RAM is the sweet spot.

If you don’t have access to powerful GPUs you should check out runpod.io and vast.ai. They are great cloud compute platforms that allow you to rent-a-gpu for relatively cheap (typically for a few bucks an hour). Worth looking into if you want to tinker with the larger models, but there are many ways to get access to those whether renting a GPU or trying to get GGML / Quantized model running on your local hardware at home - which is 100% doable if you have at least a 1660 (or newer) graphics card. I haven’t had a lot of time to interact with AMD benchmarks so I’d love to hear how it goes for anyone running one of those cards. I’ll be doing a thorough bench later this month once I finish setting up the server.

What’s great about all of this is that compute for AI is going down for consumer hardware in general. I wouldn’t be surprised if we started to see people running models that have 65B+ parameters somewhat casually before the end of the year.

@abhibeckert · edit-2 2 years ago

I get great performance with Stable Diffusion (Automatic1111) on my entry level M1 MacBook Air (which, if you don’t know Macs, is an old model of an entry level laptop).

In particular Apple’s modern GPUs have a lot of memory even at the low end.

It generally takes about 20 seconds or so to generate an image (and again, this is a low end fanless ultraportable laptop…)

ffhein · 2 years ago

Text/conversation generation works surprisingly well with CPU only. And it’s possible to split the work between GPU and CPU to achieve a significant speed-up even if you don’t have enough VRAM to fit the whole model. If you don’t have a very powerful CPU you might still be able to get good results with a 7B model, and I think I’ve seen 3B models which I assume require even less resources.

Haven’t played around with stable diffusion in some time, but unless things have changed GPU computing power is much more important for this. When I tried it you needed to be able to fit the entire model in VRAM but maybe it’s possible to split it nowadays. Generating a 512*512 image with my old GTX1080 took about half a minute, but it went down to a few seconds after upgrading to an RTX3080. Exact time requirement will of course depend on which settings you use.

Scew · 2 years ago

You should make a more advanced version too. Comfy-ui is wonderful but also has a bigger learning curve.

@Blaed · 2 years ago

Great suggestion! I’ll add that to the list.

For anyone unsure what this is, here’s a quick peek at Comfy-ui.

Kind of similar to ControlNet, it’s another method for expanding the options you have over your Stable Diffusion workflow.

ffhein · 2 years ago

I think several of the listed projects builds on llama.cpp, though it’s also possible to run it directly. The example programs they provide seem a little more bare-bones than other projects, e.g. their “main” program runs in a terminal rather than a fancy web UI. Personally I found llama.cpp running in docker an easy way to get GPU acceleration, since Cuda Toolkit doesn’t support Fedora 38, but I’m just getting started with personal GPTs so I haven’t explored all the options.

Your Lemmy Crash Course to Free Open-Source AI

Your Lemmy Crash Course to Free Open-Source AI

Join the AI Horde!