Gamers Nexus went over the available numbers, and the 9070 launch was small scale compared to others. It is also massively overpriced relative to its MSRP. I wouldn't touch anything under 16 GB now if you want to play around with AI more seriously. I have 16 GB and it is capable, but I really wish I had more: I can only barely run the models that are both good enough to be broadly useful and open enough that I can hack around with them. I almost bought a machine with a 12 GB GPU and am very happy I did not.
It also depends on your use case. I have 10 GB of VRAM and local LLMs work fine for spell/style checking, idea generation, and name generation (naming planet clusters thematically in Star Ruler 2).
I get functional code snippets for around 3 out of 4 questions in any major and most minor languages from a local model. I also get good summaries of code functionality if I paste up to around 1k lines into the context, and fun collaborative story writing in different formats using themes from my own science fiction universe. I had explored smaller models in hopes of fine-tuning before I discovered the utility of a much larger but quantized model. I never use anything smaller than a 70B or 8×7B because, for my uses, there is no real comparison. On my hardware, these generate a text stream close to my reading pace.
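For anyone curious what that setup looks like in practice, here is a minimal sketch of running a quantized large model locally with llama-cpp-python. The model file name, context size, and layer offload count are just illustrative assumptions; you would tune them to your own quant and VRAM.

```python
# Sketch only: load a local GGUF quant and ask it to summarize pasted code.
# The model path is hypothetical; pick whatever quant fits your card.
from llama_cpp import Llama

code_snippet = open("some_module.py").read()  # up to ~1k lines works well for me

llm = Llama(
    model_path="models/llama-2-70b-chat.Q4_K_M.gguf",  # hypothetical local quant
    n_ctx=4096,        # enough room for the pasted code plus the reply
    n_gpu_layers=40,   # offload as many layers as VRAM allows; tune per card
)

resp = llm.create_chat_completion(
    messages=[{"role": "user",
               "content": "Summarize what this code does:\n" + code_snippet}],
    max_tokens=512,
)
print(resp["choices"][0]["message"]["content"])
```

The n_gpu_layers value is what the VRAM discussion above is really about: with more VRAM you offload more of the model and the token stream gets faster.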
Image and video generation is where you really see the VRAM bottleneck. I have a 12 GB 4070 and it cannot generate any video despite my best efforts and tweaks.
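For what it's worth, the usual tricks for squeezing a text-to-video pipeline onto a card like that look roughly like the sketch below, assuming the diffusers library; the checkpoint name is just an example, and on some models even this won't be enough.

```python
# Sketch only: memory-saving options for a text-to-video pipeline on a 12 GB card.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b",  # example checkpoint, not an endorsement
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the GPU
pipe.enable_vae_slicing()        # decode frames in slices to cap peak VRAM

frames = pipe("a spaceship drifting past a gas giant", num_frames=16).frames
```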