The Strix Halo APU (≈ RTX 4060 Laptop/Mobile in performance) supports up to 128 GB RAM. Such a waste of an APU to offer only 48 GB RAM, and, AFAIK, only 75 % of the available RAM can be assigned to an LLM / AI on top of that. It could also be that the 48 GB figure is already the usable share: 64 GB RAM * 0.75 = 48 GB.
Calling it VRAM is OK, but it's not using GDDR6 or GDDR7 chips; instead it's the much slower-per-bit LPDDR5X at 8000 MT/s on a wide 256-bit memory bus (= quad-channel, which is AMD Threadripper territory): 256 GB/s = 256-bit * 8000 MT/s / 1000 / 8.
For comparison, a normal dual-channel desktop PC has ~2.67x lower bandwidth: 96 GB/s = 128-bit * 6000 MT/s / 1000 / 8.
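A minimal sketch of the bandwidth arithmetic used above (peak theoretical bandwidth = bus width in bits * transfer rate in MT/s, divided by 8 bits per byte and by 1000 to get GB/s):

```python
def bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Peak theoretical memory bandwidth in GB/s."""
    return bus_width_bits * transfer_rate_mts / 8 / 1000

print(bandwidth_gbs(256, 8000))  # Strix Halo, 256-bit LPDDR5X-8000: 256.0 GB/s
print(bandwidth_gbs(128, 6000))  # dual-channel DDR5-6000 desktop:    96.0 GB/s
print(bandwidth_gbs(256, 8000) / bandwidth_gbs(128, 6000))  # ~2.67x
```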
If it's the full 48 GB VRAM (= 64 GB * 0.75), then it's enough for Qwen3.6-35B-A3B-UD-Q4_K_M, a state-of-the-art LLM for agentic workflows; if it's 48 GB * 0.75 = 36 GB, then it may not be enough, depending on your workload (see the sketch after the quote below):
Quote from reddit.com/r/LocalLLaMA/comments/1sq94qx/is_anyone_getting_real_coding_work_done_with..: "I've come to the conclusion that (1) 32768 is the biggest context I can get away with in an adequately smart model, and (2) it just ain't enough."
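A minimal sketch of the two readings of the 75 % cap; the 0.75 fraction and the GB figures are the speculation from above, not confirmed specs:

```python
def usable_vram_gb(total_ram_gb: float, fraction: float = 0.75) -> float:
    """RAM assignable to the LLM, assuming a fixed assignable fraction."""
    return total_ram_gb * fraction

# Reading 1: the 48 GB is already 75 % of a 64 GB machine.
print(usable_vram_gb(64))  # 48.0 GB -> enough
# Reading 2: the cap applies to the 48 GB itself.
print(usable_vram_gb(48))  # 36.0 GB -> may be too tight for long contexts
```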