- cross-posted to:
- machinelearning
- [email protected]
cross-posted from: https://lemmy.world/post/811496
Huge news for AMD fans and those who are hoping to see a real* open alternative to CUDA that isn’t OpenCL!
*: Intel doesn’t count; they still need to get their act together on rendering things correctly with their GPUs.
We plan to expand ROCm support from the currently supported AMD RDNA 2 workstation GPUs: the Radeon Pro v620 and w6800 to select AMD RDNA 3 workstation and consumer GPUs. Formal support for RDNA 3-based GPUs on Linux is planned to begin rolling out this fall, starting with the 48GB Radeon PRO W7900 and the 24GB Radeon RX 7900 XTX, with additional cards and expanded capabilities to be released over time.
And even when you do, you’re going to find it infinitely more productive (and also more performant!) to use OpenAI’s Triton, or something like Tiramisu or Halide, to implement custom fused matrix multiplies or convolutions. I honestly believe CUDA as a distinctive advantage of Nvidia GPUs has plateaued.
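To make the “fused” part concrete: a fused kernel computes several chained ops per output tile without writing intermediates back to memory, which is the main win compilers like Triton give you. NumPy can’t express the tiling itself, so this is just a sketch of the math being fused (matmul + bias + ReLU is a hypothetical example pipeline, not from the post):

```python
import numpy as np

def unfused(a, b, bias):
    # Each step materializes a full intermediate array in memory.
    t = a @ b                   # matmul: result written out
    t = t + bias                # read back, add bias, written out again
    return np.maximum(t, 0.0)   # read back once more for the ReLU

def fused(a, b, bias):
    # A fused GPU kernel would do all three ops per output tile,
    # keeping intermediates in registers/shared memory. This single
    # expression only shows that the math is identical.
    return np.maximum(a @ b + bias, 0.0)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8))
b = rng.standard_normal((8, 3))
bias = rng.standard_normal(3)
assert np.allclose(unfused(a, b, bias), fused(a, b, bias))
```

The memory traffic saved by skipping the two intermediate round-trips is where most of the speedup comes from on bandwidth-bound GPUs.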
Nvidia has a few nifty tricks still. Their sparse tensor cores support 2:4 structured sparsity: if at least two of every four values in a matrix are zero (common in pruned AI models), it can be stored in half the space and multiplied at roughly double the throughput. I don’t think AMD has an equivalent sparse FP16 matrix instruction yet.
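A rough sketch of the 2:4 idea in plain NumPy, to show why it halves storage: keep only the two largest-magnitude values in each group of four, plus their positions. This is an illustration of the storage scheme, not Nvidia’s actual instruction or on-chip format:

```python
import numpy as np

def compress_2of4(row):
    """Keep the 2 largest-magnitude values per group of 4, plus their
    2-bit positions -- roughly half the storage of the dense row."""
    vals, idxs = [], []
    for g in range(0, len(row), 4):
        group = row[g:g + 4]
        keep = sorted(np.argsort(np.abs(group))[-2:])  # 2 nonzeros per 4
        idxs.append(keep)
        vals.append(group[keep])
    return np.array(vals), np.array(idxs)

def decompress_2of4(vals, idxs, n):
    """Scatter the kept values back into a dense zero-filled row."""
    out = np.zeros(n)
    for g, (v, i) in enumerate(zip(vals, idxs)):
        out[4 * g + i] = v
    return out

row = np.array([0.1, -2.0, 0.0, 3.0, 1.5, 0.0, 0.0, -0.2])
vals, idxs = compress_2of4(row)
restored = decompress_2of4(vals, idxs, len(row))
# Groups that already had >= 2 zeros survive exactly; denser groups
# lose their smallest-magnitude values (that's the pruning step).
```

In hardware the tensor core consumes the compressed values and the position metadata directly, skipping the multiply-by-zero work, which is where the throughput doubling comes from.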
AMD is behind in AI, but not significantly. AMD is ahead in double-precision / 64-bit compute, by a wide margin. AMD was also first to ship chiplets in a datacenter GPU with the MI200, which puts them in a strong position for future innovation.