AMD, make a 384-bit GPU (like a 4090 24 GB is one), with GDDR7 and put the GDDR7 memory chips on both sides of the PCB (aka VRAM clamshell design). A 1 TB/s, 48 GB VRAM, consumer GPU will be very interesting for hosting LLMs privately and locally. llama.cpp Vulkan build run LLMs just fine, CUDA is not needed for endconsumer AI inferenceing.
QuoteMLID suggests that AMD might not want to lose this advantage by launching an RX 9080 XT with 32 GB of GDDR7 VRAM.
But AI folks want GDDR7 for higher memory bandwidth per same bit bus width. Don't waste the GPU silicon chip by connecting slower GDDR6 memory chips to it.
Is this rumored 32 GB card using a 256-bit GPU chip (= natively 16 GB VRAM) and is doubling the VRAM by using clamshell design? At least the name jump may suggest so? Otherwise it would be called RX 9090 XT. Or is that a native 32 GB card (= uses a 512-bit GPU chip, just like a RTX 5090 one is)?
Quote from: Terror Byte on Today at 08:20:05Specs make no sesne. 32GB would target 5090.
Yes, there is: Running AI/LLMs locally and privately. Also, the obvious thing is that a 5090 exists, people buy it and AMD is losing on those sales.