This mini-PC doesn't really deserve "AI" and/or "Pro" in its product name.
AI, both for training and for end-consumer use (inferencing), requires these 3 things:
1. Memory size, to fit a decently capable LLM.
32 GB RAM, with no additional dGPU/VRAM to offload to, may not be enough for the new SOTA LLM, again especially in agentic workflows: Qwen3.6-35B-A3B-UD-Q4_K_M:
Quote from reddit.com/r/LocalLLaMA/comments/1sq94qx/is_anyone_getting_real_coding_work_done_with..:
"I've come to the conclusion that (1) 32768 is the biggest context I can get away with in an adequately smart model, and (2) it just ain't enough."
The option of up to 128 GB RAM seems interesting at first, but it is not enough to fit decent quants (like Q8) of, e.g.:
- MiMo-V2.5 (310 B)
- DeepSeek V4-Flash (284 B)
- MiniMax-M2.7 (229 B)
- Mistral Medium 3.5 (128 B)
- Qwen3.5-122B
Or decent quants of any other SOTA LLM in the 120 to 300 B parameter range. No wonder there are rumors of a Strix Halo refresh/Medusa Halo with up to 192 GB or 256 GB RAM.
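To put rough numbers on the list above, here is a minimal sketch of a weights-only size estimate. It assumes about 8.5 bits per weight for a Q8 quant (my assumption, roughly what GGUF Q8_0 uses) and ignores KV cache, context and OS overhead, so real requirements are higher. The function and constant names are just for illustration.

```python
# Weights-only size estimate for the models listed above.
# Assumption: a Q8 quant needs ~8.5 bits per weight (roughly GGUF Q8_0).
# KV cache, context and OS overhead are NOT included.
BITS_PER_WEIGHT_Q8 = 8.5
RAM_GB = 128  # the maximum RAM configuration mentioned above

# Parameter counts in billions, as given in the list above.
MODELS_B = {
    "MiMo-V2.5": 310,
    "DeepSeek V4-Flash": 284,
    "MiniMax-M2.7": 229,
    "Mistral Medium 3.5": 128,
    "Qwen3.5-122B": 122,
}


def weights_gb(params_billion: float, bits_per_weight: float = BITS_PER_WEIGHT_Q8) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


for name, params in MODELS_B.items():
    size = weights_gb(params)
    verdict = "fits" if size < RAM_GB else "does not fit"
    print(f"{name}: ~{size:.0f} GB at Q8 -> {verdict} in {RAM_GB} GB RAM")
```

Under that assumption even the smallest model in the list lands just above 128 GB before any context is allocated.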
2. Memory speed, also known as bandwidth, relevant for token generation (output) speed.
The measured memory speed is only 43526 MB/s (as only one RAM slot is populated), which is in line with any single-channel (1 * 64-bit) PC/laptop/mini-PC running at 5600 MT/s. This is not even dual-channel speed, as can be seen in the table of this very review.
For comparison (memory bandwidth):
This (single-channel): 44.8 GB/s = 64-bit * 5600 MT/s / 1000 / 8.
Strix Halo: 256 GB/s = 256-bit * 8000 MT/s / 1000 / 8.
(Yes, Strix Halo is about 5.7 times faster.)
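The bandwidth comparison above is just bus width times transfer rate, converted from bits to bytes. A small sketch of that arithmetic follows; the function name is mine, and the results are theoretical peaks, not measured values.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits * MT/s gives Mbit/s; / 1000 gives Gbit/s; / 8 gives GB/s.
    """
    return bus_width_bits * transfer_rate_mts / 1000 / 8


single_channel = peak_bandwidth_gbs(64, 5600)   # this mini-PC, one slot populated
dual_channel = peak_bandwidth_gbs(128, 5600)    # same RAM speed, both slots populated
strix_halo = peak_bandwidth_gbs(256, 8000)      # 256-bit bus at 8000 MT/s

print(f"Single-channel: {single_channel:.1f} GB/s")
print(f"Dual-channel:   {dual_channel:.1f} GB/s")
print(f"Strix Halo:     {strix_halo:.1f} GB/s")
print(f"Strix Halo vs. single-channel: {strix_halo / single_channel:.1f}x")
```

The measured 43526 MB/s from the review sits just below the single-channel peak, which is what you would expect.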
3. GPU compute, relevant for prompt processing (input) speed.
Prompt processing: the larger the input, the faster the GPU you need, especially for agentic workflows. Prompt processing speed scales roughly with a GPU's FPS in games/3D benchmarks. Here, the iGPU scores 1950 points in 2560x1440 Time Spy Graphics. For comparison, a 5070 desktop: 3dmark.com/search: "Average score: 20330", which is roughly 10 times higher.
(The number of CPU threads doesn't matter much for running AI (aka inferencing); 4 threads pretty much max out a dual-channel PC.)
-> As such, this mini-PC is also too slow to run Qwen3.6-27B at decent prompt processing speeds (because of the slow iGPU) and token generation speeds (because of the low memory bandwidth), even if the LLM fits.
-> So, in short, for those who don't know: for a similar price you can build a desktop/mini-ITX/similarly sized PC with the same or more RAM, and it will be upgradable, repairable, have a much faster GPU, run quieter and cooler, and therefore probably also last longer.
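To give a rough feeling for the token generation claim, here is a back-of-the-envelope sketch. It uses the common rule of thumb that generation is memory-bound, so every generated token has to read all active weights once and the upper bound is bandwidth divided by weight size. The assumptions are mine: Qwen3.6-27B treated as a dense model, a 4-bit quant at about 4.5 bits per weight, and no KV cache reads or other overhead. Real speeds will be lower.

```python
def max_tokens_per_s(bandwidth_gbs: float, params_billion: float,
                     bits_per_weight: float) -> float:
    """Upper bound on token generation speed for a memory-bound dense model.

    Rule of thumb: each token reads all weights once, so
    tokens/s <= bandwidth / weight size. Ignores KV cache and overhead.
    """
    weights_gb = params_billion * bits_per_weight / 8
    return bandwidth_gbs / weights_gb


PARAMS_B = 27          # Qwen3.6-27B, assumed dense
BITS_PER_WEIGHT = 4.5  # assumption: a 4-bit quant including scales

this_mini_pc = max_tokens_per_s(44.8, PARAMS_B, BITS_PER_WEIGHT)   # single-channel
strix_halo = max_tokens_per_s(256.0, PARAMS_B, BITS_PER_WEIGHT)

print(f"This mini-PC: at most ~{this_mini_pc:.1f} tokens/s")
print(f"Strix Halo:   at most ~{strix_halo:.1f} tokens/s")
```

Under these assumptions the single-channel mini-PC tops out at roughly 3 tokens/s, versus roughly 17 tokens/s for Strix Halo.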
For the mentioned SOTA LLM: if a notebook/(mini-)PC/device has a total memory size of ~48 GB RAM+VRAM (32 GB RAM + 8 GB VRAM may also be enough), then there is enough memory/context for agentic workflows. The 32k context (according to that reddit quote) that 32 GB RAM [and no dGPU] allows may also not be enough for non-agentic workflows. This issue of only 32 GB total RAM+VRAM weighs by far the most, as running something slower (= with no dGPU, whose much faster VRAM you could offload to and which would allow much faster prompt processing) is better than not being able to fit and run an LLM at all.