The VRAM amount per $ is good, fair enough, but:
Why not use GDDR7? It offers 3 GB per-chip density, so this GPU, using the same die (which would of course need a GDDR7 PHY), could have 48 GB VRAM:
256-bit / 32-bit per chip = 8 chips; 8 chips * 3 GB per chip = 24 GB; 24 GB * 2 (clamshell, chips on both sides of the PCB) = 48 GB VRAM. The memory bandwidth would also be ~30% higher, because it's GDDR7 instead of GDDR6.
Alternative calculation: 32 GB VRAM * 1.5 (3 GB per chip, instead of the current 2 GB) = 48 GB VRAM.
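A quick back-of-the-envelope sketch of that capacity math in Python, using only the numbers above (nothing vendor-specific, just the 32-bit-per-chip GDDR interface rule):

```python
def vram_gb(bus_width_bits: int, chip_density_gb: float, clamshell: bool = False) -> float:
    """Total VRAM from bus width, per-chip density, and optional clamshell
    (memory chips on both sides of the PCB)."""
    chips = bus_width_bits // 32          # each GDDR chip has a 32-bit interface
    total = chips * chip_density_gb
    return total * 2 if clamshell else total

print(vram_gb(256, 2, clamshell=True))   # GDDR6, 2 GB chips -> 32.0 GB (as shipped)
print(vram_gb(256, 3, clamshell=True))   # GDDR7, 3 GB chips -> 48.0 GB (the proposal)
```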
Frankly, when it comes to AI/LLMs, I'm not interested in 32 GB VRAM GPUs, and I said this over a year ago.
Let's see if NVIDIA releases a consumer 48 GB VRAM GPU in the RTX 60 series (probably not, but NVIDIA did release an RTX PRO 6000 Blackwell with 96 GB VRAM, which I also didn't expect).
Quote: "aimed at AI LLM training / inference without providing specific performance info in the initial press release."
(en.wikipedia.org/wiki/GeForce_RTX_50_series#Desktop, en.wikipedia.org/wiki/Intel_Arc#Workstation_2)
- RTX 5090: 1792 GB/s = 512-bit * 28 Gb/s / 8.
- B70 Pro / B65 Pro: 608 GB/s = 256-bit * 19 Gb/s / 8.
So a consumer 5090 has roughly 3x the memory bandwidth, and CUDA, of course. The big thing the B70/B65 could have had going for them is 48 GB VRAM.
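Same sanity check for those bandwidth figures, using the standard bus-width * per-pin data rate formula (the two cards' numbers are from the bullets above):

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Memory bandwidth in GB/s: bus width (bits) * per-pin rate (Gb/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps_per_pin / 8

rtx_5090 = bandwidth_gb_s(512, 28)   # 1792.0 GB/s
b70_pro  = bandwidth_gb_s(256, 19)   # 608.0 GB/s
print(rtx_5090 / b70_pro)            # ~2.95, i.e. roughly 3x
```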
For inference these are probably fine (if they're already supported by the usual inference apps, like llama.cpp; I know NVIDIA works just fine via Vulkan, ~20% slower than using CUDA), but then again, a 5090 has ~3x the token generation speed and also much higher prompt processing speed.
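For what it's worth, a rough way to see why the bandwidth gap maps almost directly to the token generation gap: single-stream decoding is usually memory-bound, so tokens/s is capped at roughly bandwidth divided by the bytes read per token (the model weights, ignoring KV cache and overhead). The model size below is an illustrative assumption, not a benchmark:

```python
def tokens_per_s_ceiling(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Crude upper bound for memory-bound decoding: each generated token
    streams all weights through the memory bus once."""
    return bandwidth_gb_s / model_size_gb

# e.g. a ~32 GB quantized model that would just fit in 48 GB VRAM (illustrative):
print(tokens_per_s_ceiling(1792, 32))  # 5090-class bandwidth  -> ~56 tok/s ceiling
print(tokens_per_s_ceiling(608, 32))   # B70-class bandwidth   -> ~19 tok/s ceiling, ~3x slower
```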