Quote: "a 384-bit LPDDR6 or a 256-bit LPDDR5X memory controller."
(AMD's Strix Halo has a 256-bit wide memory interface and supports up to 128 GB RAM: 8000 MT/s * 256-bit / 1000 / 8 = 256 GB/s.)
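The bandwidth figure follows from a simple peak-bandwidth formula; a minimal sketch (the function name is mine):

```python
def bandwidth_gb_s(mt_s: int, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: transfers/s times bytes per transfer."""
    return mt_s * bus_width_bits / 8 / 1000

# Strix Halo: LPDDR5X-8000 on a 256-bit bus
print(bandwidth_gb_s(8000, 256))  # -> 256.0
```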
AMD specifically advertises Strix Halo for running AI / LLMs. If the memory density doesn't change with LPDDR5X in 2027+, then the 256-bit rumor only makes sense if the "Halo" word is defined to stay at 256-bit, i.e. at basically the same performance as Strix Halo (just like the "Point" word in e.g. Phoenix Point indicates the same tier). This [using LPDDR5X] would make Medusa Halo a refresh/rename of Strix Halo, just like Hawk Point was of Phoenix Point. But Hawk Point came one year after Phoenix Point, while Medusa Halo would come out 2+ years after Strix Halo; staying at the same tier / performance and memory size (say 10000 MT/s vs 8000 MT/s, but still capped at 128 GB RAM) would be almost unexpected.
In 2027+, what would really be needed is at least a 384-bit wide chip (preferably a 512-bit one): the big and smart generalist MoE LLMs [1] require at least a 384-bit wide memory interface and the 192 GB RAM that would come with it just to fit into the RAM.
If DDR6 / LPDDR6 (almost) doubles the memory bandwidth and doubles the memory density at the same bit width, then a 256-bit width would be enough:
14400 MT/s * 256-bit / 1000 / 8 = 460.8 GB/s and up to 256 GB RAM. Improvements in MoE LLM architecture would also reduce the memory-bandwidth requirement: 8000 MT/s vs 14400 MT/s is not a doubling, but then it wouldn't need to be either. The doubling in memory size would be required, though, as this is how MoE scales (fewer active parameters (= less memory bandwidth required), but a larger total size to compensate for it).
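The scenarios discussed can be tabulated with the same formula (the configuration labels and the 384-bit max-RAM figure of 192 GB are taken from the text; the rest of the naming is mine, for illustration):

```python
def bandwidth_gb_s(mt_s, bus_width_bits):
    # Peak bandwidth in GB/s: MT/s * bus width in bits / 8 / 1000
    return mt_s * bus_width_bits / 8 / 1000

# (speed in MT/s, bus width in bits, max RAM in GB)
configs = {
    "Strix Halo (LPDDR5X-8000, 256-bit)":     (8000, 256, 128),
    "384-bit LPDDR5X-8000 (rumored option)":  (8000, 384, 192),
    "256-bit LPDDR6-14400 (doubled density)": (14400, 256, 256),
}
for name, (mts, bits, ram) in configs.items():
    print(f"{name}: {bandwidth_gb_s(mts, bits):.1f} GB/s, up to {ram} GB RAM")
```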
[1]:
- huggingface.co/unsloth/GLM-4.7-GGUF (the Q3_K_M quant is 171 GB, so one would at least need 171 GB RAM + additional 10-20 GB for the OS and the context)
- huggingface.co/unsloth/MiniMax-M2.1-GGUF (at least a Q4_K_M quant (138 GB) is recommended according to the latest user comments)
- huggingface.co/unsloth/MiMo-V2-Flash-GGUF (Q3_K_M (147 GB) to Q4_K_M (187 GB))
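A rough fit check for the quants listed in [1], using the upper end of the 10-20 GB OS + context overhead mentioned above (sizes are taken from the list; the tier values assume 128/192/256 GB for 256/384/512-bit configurations, as discussed in the text):

```python
# (model quant, file size in GB) from the list in [1]
quants = {
    "GLM-4.7 Q3_K_M": 171,
    "MiniMax-M2.1 Q4_K_M": 138,
    "MiMo-V2-Flash Q3_K_M": 147,
    "MiMo-V2-Flash Q4_K_M": 187,
}
OVERHEAD_GB = 20  # upper end of the 10-20 GB OS + context estimate

for ram in (128, 192, 256):  # assumed RAM tiers for 256/384/512-bit
    fits = [name for name, size in quants.items() if size + OVERHEAD_GB <= ram]
    print(f"{ram} GB RAM fits: {fits or 'none'}")
```

With 128 GB none of these quants fit, while 192 GB fits all but the largest Q4_K_M, which matches the "at least 384-bit / 192 GB" argument above.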