Quote from: 48gb vram on Yesterday at 01:43:47
48gb vram? So 2 rtx 5090m ?
In terms of pure allocatable memory, yes. It may even be slightly more than 2 RTX 5090m, because the OS lives in the other 16 GB of memory.
In terms of FPS, it's a Strix Halo APU, so it's comparable to a 4060 Mobile:
3dmark.com/search - Steel Nomad:
Radeon 8060S: Average score: 2031
RTX 5050 (notebook): Average score: 2124
RTX 4060 (notebook): Average score: 2262
In terms of memory bandwidth:
Strix Halo: 256 GB/s (=256-bit * 8000MT/s / 1000 / 8)
RTX 5050 (notebook): 384 GB/s (en.wikipedia.org/wiki/GeForce_RTX_50_series#Mobile)
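The bandwidth arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula in the Strix Halo line; the function name is made up for this post, and the 128-bit / 24000 MT/s inputs for the RTX 5050 (notebook) are my assumption, picked because they reproduce the 384 GB/s figure.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus width (bits) / 8      -> bytes moved per transfer
    * transfer rate (MT/s)    -> MB/s
    / 1000                    -> GB/s
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s / 1000


# Strix Halo: 256-bit bus at 8000 MT/s (figures from the post above).
strix_halo = peak_bandwidth_gb_s(256, 8000)

# RTX 5050 (notebook): ASSUMED 128-bit bus at 24000 MT/s.
rtx_5050_notebook = peak_bandwidth_gb_s(128, 24000)

print(f"Strix Halo:          {strix_halo:.0f} GB/s")
print(f"RTX 5050 (notebook): {rtx_5050_notebook:.0f} GB/s")
```

These are theoretical peaks; real-world throughput is lower.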
AMD advertises Strix Halo mainly for AI. A 128 GB RAM Strix Halo will fit a bigger LLM, but its prompt processing speed will be that of a 4060 (notebook)/5050 (notebook), and its token generation speed will be that of a 4060 (notebook) (256 GB/s).
For the end consumer, AI (both training and inference) requires these 3 things:
1. Memory size, to fit a decently capable LLM.
2. Memory speed, also known as bandwidth, relevant for token generation (output) speed.
3. GPU compute, relevant for prompt processing (input) speed.
Prompt processing: The larger the input, the faster the GPU you need, especially for agentic workflows. Prompt processing speed roughly tracks the GPU's FPS in games/3D benchmarks.
(The number of CPU threads doesn't matter for running AI, i.e. inference: 4 threads pretty much max out a dual-channel PC.)
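To show why memory bandwidth drives token generation speed, here is a rough back-of-the-envelope sketch. It assumes a dense model whose weights are all read from memory once per generated token, and it ignores the KV cache, MoE sparsity, and any overhead, so treat the result as a ceiling, not a benchmark. The function name and the model sizes are hypothetical, chosen only for illustration.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on token generation speed for a memory-bound dense model.

    ASSUMPTION: every generated token requires reading all model weights
    once, so tokens/s cannot exceed bandwidth / model size.
    """
    return bandwidth_gb_s / model_size_gb


BANDWIDTH_GB_S = 256.0  # Strix Halo / 4060 (notebook) figure from the post

# Hypothetical in-memory model sizes (weights only), in GB.
for size_gb in (8.0, 16.0, 40.0):
    ceiling = max_tokens_per_second(BANDWIDTH_GB_S, size_gb)
    print(f"{size_gb:>5.0f} GB model: at most ~{ceiling:.1f} tokens/s")
```

The point of the sketch: more memory lets you load a bigger model, but at the same bandwidth the bigger model generates proportionally slower.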