Quote: "Well, you will run LLM on NPU. Not iGPU lmao."
It's the other way around. Big, popular, SOTA LLMs (the ones you can run via llama.cpp's WebUI, downloading the models straight from huggingface.co) don't run on the NPU at all (NPU inference isn't even supported on Linux, for example); they run on the GPU (or iGPU) and memory. An NPU requires specially prepared LLMs (like the ones built into Windows 11), which makes an NPU almost useless so far (not saying it can't become a popular, power-efficient accelerator).
If you read the comments, an NPU is never mentioned. It's all about memory size, memory bandwidth, and the GPU performance that usually results from them; those are the key hardware requirements (plus, of course, software support like CUDA, mainly for training).
Sorry, but maybe you're coping with having bought a more expensive and, in this sense, unnecessary Ryzen AI platform. Unless you can prove with benchmarks (pp, tg, power efficiency, and overall LLM performance) that an NPU is worth it.
Faster in what? Prompt processing or token generation? Run a benchmark from about 1k to 64k context (if it fits) comparing NPU vs. iGPU. Maybe an NPU can be used alongside the iGPU, but I still need to see where an NPU is actually useful, more power-efficiently, beyond blurring the webcam background or faking where the eyes are looking (power efficiency matters a lot, but those use cases are rather niche so far).
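For the iGPU side of that comparison, a sketch of how I'd measure it with llama-bench (the benchmark tool that ships with llama.cpp, which prints pp and tg throughput in tokens/s). "model.gguf" is a placeholder, substitute whatever GGUF you pulled from huggingface.co; the NPU side would need vendor-specific tooling, since llama.cpp itself has no NPU backend:

```shell
# llama-bench flags (all real llama.cpp options):
#   -m   path to the GGUF model (placeholder name here)
#   -ngl 99 offloads all layers to the (i)GPU backend
#   -p   prompt-processing sizes to test, here 1k to 64k tokens
#   -n   number of tokens to generate for the tg measurement
llama-bench -m model.gguf -ngl 99 -p 1024,4096,16384,65536 -n 128
```

Pair each run with a wall-power reading (e.g. a plug meter) if you want tokens per joule, since tokens/s alone won't settle the power-efficiency question.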