And regarding 96 GB @ 819.3 GB/s vs 128 GB @ 273 GB/s
Total memory size x speed value (a rough figure of merit, not a benchmark):
M3 Ultra: ~78653 = 819.3 GB/s * 96 GB
GB10/Spark: 34944 = 273 GB/s * 128 GB
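For anyone who wants to redo the math, here is a quick sketch of the size x speed figure above. The only inputs are the bandwidth and capacity numbers quoted in this post; the function and variable names are made up for illustration, and the product itself is just a rough figure of merit, not a measured result.

```python
# Figures quoted in the post: (bandwidth in GB/s, capacity in GB).
systems = {
    "M3 Ultra": (819.3, 96),
    "GB10/Spark": (273.0, 128),
}


def size_times_speed(bandwidth_gb_s, capacity_gb):
    """Rough figure of merit: memory bandwidth multiplied by capacity."""
    return bandwidth_gb_s * capacity_gb


products = {name: size_times_speed(*spec) for name, spec in systems.items()}
for name, value in products.items():
    print(f"{name}: {value:.0f}")

# How far ahead the M3 Ultra is on this metric.
ratio = products["M3 Ultra"] / products["GB10/Spark"]
print(f"ratio: {ratio:.2f}x")
```

On this metric the 96 GB M3 Ultra comes out a bit over 2x ahead, despite having less RAM.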
Offering only 96 GB is crazy low tho, and a waste of the huge 1024-bit APU's potential. It's exactly like giving an RTX 5090 6 GB of VRAM (5.33 times lower: 512 GB -> 96 GB = 32 GB -> 6 GB). So low, in fact, that it's going to push some users to get AMD Strix Halo or NVIDIA Spark, depending on their use case, especially considering the price.
Thing is, 96 GB @ 819.3 GB/s can fit and run dense models that outperform MoE models on the slower 128 GB RAM ;) and that compensates hugely for the fact that it's not 128 GB. If Qwen releases a dense model in, e.g., the 50B to 90B range, then the 819.3 GB/s starts to shine even more. But we don't have to wait: Qwen3.6-27B dense is a good example.
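To put a rough number on why bandwidth matters so much for dense models, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bandwidth-bound and that every generated token streams all weights from RAM once, which is a common rule of thumb, not a measurement. The 1 byte per parameter (8-bit quantization) is my assumption, and KV cache, compute limits and runtime overhead are ignored, so treat the results as ceilings only. The function name is made up for illustration.

```python
def decode_ceiling_tok_per_s(bandwidth_gb_s, params_billion, bytes_per_param):
    """Upper bound on decode speed if each token reads all weights once.

    Assumption: generation is memory-bandwidth-bound; KV cache traffic,
    compute limits and software overhead are ignored.
    """
    # billions of parameters * bytes per parameter = weight size in GB
    weights_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weights_gb


# A 27B dense model at an assumed 8-bit quantization (1 byte per parameter).
PARAMS_B = 27
BYTES_PER_PARAM = 1.0

for name, bandwidth in [("M3 Ultra", 819.3), ("GB10/Spark", 273.0)]:
    ceiling = decode_ceiling_tok_per_s(bandwidth, PARAMS_B, BYTES_PER_PARAM)
    print(f"{name}: at most ~{ceiling:.1f} tok/s")
```

Whatever quantization you plug in, the ratio between the two machines stays at about 3x for a dense model, because it is just 819.3 / 273.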