> The ECC memory is essential here, as a single bit flip could ruin a long render or an AI training run.
This GPU is not capable of training an LLM large enough for ECC to matter (even with runs lasting several days, it is unlikely anything would happen). Nor is ECC necessary for fine-tuning on this GPU. Also, there are checkpoints, so one doesn't lose everything, and even if one did, on this GPU the cost of losing a run is a few bucks or a few days, in that ballpark. How do I know? I asked the big LLMs over at arena.ai (aka lmarena.ai).
(en.wikipedia.org/wiki/GeForce_RTX_50_series#Desktop, en.wikipedia.org/wiki/Intel_Arc#Workstation_2)
- RTX 5090: 1792 GB/s = 512-bit bus * 28 Gb/s per pin / 8.
- B70 Pro / B65 Pro: 608 GB/s = 256-bit bus * 19 Gb/s per pin / 8.
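The arithmetic behind those two lines is just bus width times per-pin data rate, divided by 8 bits per byte. A quick sketch:

```python
def mem_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    # GB/s = bus width (bits) * per-pin data rate (Gb/s) / 8 bits per byte
    return bus_width_bits * pin_rate_gbps / 8

print(mem_bandwidth_gbs(512, 28))  # RTX 5090 (GDDR7): 1792.0 GB/s
print(mem_bandwidth_gbs(256, 19))  # B70/B65 Pro: 608.0 GB/s
```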
So a consumer 5090 has roughly 3x the memory bandwidth, and CUDA, of course. The only things the B70/B65 have going for them are 48 GB of VRAM and people willing to try making them work without CUDA.
For inference these are probably fine (I know NVIDIA cards work just fine with Vulkan, a bit slower than with CUDA), but then again, a 5090 has about 3x the token-generation speed and much higher prompt-processing throughput too.
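The "3x the token generation" claim follows from bandwidth: single-stream decoding is memory-bound, since roughly all the weights are read once per generated token. A back-of-the-envelope sketch (the 14 GB model size and 0.7 efficiency factor are my assumptions, purely illustrative):

```python
def decode_tokens_per_sec(bandwidth_gbs: float, model_gb: float,
                          efficiency: float = 0.7) -> float:
    # Memory-bound decoding: every token requires reading ~all weights once,
    # so tokens/s ~= effective bandwidth / model size in memory.
    # `efficiency` is an assumed fraction of peak bandwidth actually achieved.
    return bandwidth_gbs * efficiency / model_gb

# Hypothetical ~14 GB quantized model that fits on both cards:
print(decode_tokens_per_sec(1792, 14))  # RTX 5090
print(decode_tokens_per_sec(608, 14))   # B70/B65 Pro
```

Whatever the exact efficiency, the ratio between the two cards tracks the bandwidth ratio, about 2.95x in the 5090's favor.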