Radeon RX 7900 XTX outperforms RTX 4090 and RTX 4080 Super In DeepSeek AI benchmark

Started by Redaktion, January 29, 2025, 18:26:39


Redaktion

AMD has posted some first-party benchmarks of the Radeon RX 7900 XTX running a DeepSeek model locally. At smaller model sizes, Team Red has a decent lead, but it quickly fades as the parameter count goes up.

https://www.notebookcheck.net/Radeon-RX-7900-XTX-outperforms-RTX-4090-and-RTX-4080-Super-In-DeepSeek-AI-benchmark.954208.0.html

zenit

I read this news in other media and didn't believe it, so we'll have to wait for AMD's new RX 9070 graphics cards: according to leaks they have three times the tensor cores for artificial intelligence and could be a cheap alternative to the RTX 5070 and RTX 5080.
What surprises me is that an AMD laptop with a Strix Halo Ryzen AI Max 395 can easily run artificial intelligence locally.

The guide confirms that the new Ryzen AI Max "Strix Halo" processors come in 32 GB, 64 GB, and 128 GB soldered LPCAMM2 memory configurations, with no 16 GB option for laptop manufacturers to cut costs with. The guide goes on to explain that "Strix Halo" will be able to locally accelerate DeepSeek-R1-Distill-Llama with 70 billion parameters on the 64 GB and 128 GB configurations, while the 32 GB model should be able to run DeepSeek-R1-Distill-Qwen-32B. Ryzen AI "Strix Point" mobile processors should be able to run DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Llama-14B on their RDNA 3.5 iGPUs and NPUs, while previous-generation "Phoenix Point" and "Hawk Point" chips should be able to run DeepSeek-R1-Distill-Llama-14B. The company recommends running all of the above distills at Q4_K_M quantization.
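As a rough illustration of that recommendation, here is a minimal sketch of loading one of those distills at Q4_K_M through llama-cpp-python, one common way to run GGUF models locally. The model filename is a placeholder for whatever Q4_K_M build you actually download, and the offload settings are assumptions, not something from AMD's guide:

# Minimal sketch: run a DeepSeek-R1 distill locally at Q4_K_M quantization.
# The GGUF path is a placeholder; substitute your downloaded file.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf",  # hypothetical file
    n_gpu_layers=-1,   # offload all layers to the GPU/iGPU if memory allows
    n_ctx=4096,        # context window; raise it if you have memory headroom
)

out = llm("Summarize why Q4 quantization shrinks a 14B model.", max_tokens=256)
print(out["choices"][0]["text"])

If the model doesn't fit, lowering n_gpu_layers keeps part of it in system RAM at the cost of speed, which is exactly the trade-off discussed below.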

davidm

Strix Halo will be able to run larger models in shared RAM, but much more slowly than if they fit in VRAM, to the point that they won't be pleasant to use for applications that require responsiveness.
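That slowdown is easy to put numbers on: at batch size 1, generating each token means streaming roughly the entire weight set through memory once, so bandwidth sets a hard ceiling on tokens per second. A back-of-the-envelope sketch follows; the bandwidth figures are rough public specs (the Strix Halo number is an assumption based on its 256-bit LPDDR5X bus), not measurements:

# Upper-bound decode speed for single-stream inference: every generated token
# reads (approximately) all weights once, so tok/s <= bandwidth / model size.
def tokens_per_second(model_size_gb: float, bandwidth_gbps: float) -> float:
    return bandwidth_gbps / model_size_gb

MODEL_70B_Q4_GB = 40.0  # ~70B parameters at Q4 quantization, approximate

for name, bw_gbps in [
    ("RTX 4090 GDDR6X (~1 TB/s)", 1008.0),
    ("Strix Halo 256-bit LPDDR5X (assumed)", 256.0),
    ("dual-channel DDR5-5600", 89.6),
]:
    print(f"{name}: ~{tokens_per_second(MODEL_70B_Q4_GB, bw_gbps):.1f} tok/s ceiling")

Real throughput lands below these ceilings, but the ratios show why shared DDR5 feels sluggish next to VRAM.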

Illrigger

Quote from: davidm on January 29, 2025, 21:19:53
Strix Halo will be able to run larger models in shared RAM, but much more slowly than if they fit in VRAM, to the point that they won't be pleasant to use for applications that require responsiveness.
Maybe, but Halo has a bit of a leg up, especially considering the quad-channel, 256-bit memory bus that gives it bandwidth similar to a 4060. It may not be able to churn out the same raw it/s as a 4090, but I can pretty much guarantee it will be as fast or faster than a single 4080 or 4090 when running a 70b model, which forces the model to spill into system memory and slows it down to a couple it/s or less. And it will do so in a 150 W power envelope, vs. close to 1,000.
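The spill argument can be sanity-checked with the same kind of arithmetic: a 70b model at Q4 is on the order of 40 GB, so about 16 GB can't fit in a 4090's 24 GB of VRAM and has to stream over PCIe on every token. A sketch using approximate public specs, not benchmarks:

# Rough sanity check of the VRAM-spill scenario; all figures approximate.
VRAM_GB, VRAM_BW_GBPS = 24.0, 1008.0   # RTX 4090 capacity and bandwidth
PCIE_BW_GBPS = 32.0                    # PCIe 4.0 x16, theoretical
HALO_BW_GBPS = 256.0                   # assumed Strix Halo LPDDR5X bandwidth
MODEL_GB = 40.0                        # ~70b at Q4

spilled_gb = MODEL_GB - VRAM_GB
sec_per_tok_4090 = VRAM_GB / VRAM_BW_GBPS + spilled_gb / PCIE_BW_GBPS
sec_per_tok_halo = MODEL_GB / HALO_BW_GBPS

print(f"4090 with spill: ~{1 / sec_per_tok_4090:.1f} tok/s")  # PCIe dominates
print(f"Strix Halo:      ~{1 / sec_per_tok_halo:.1f} tok/s")

On those assumptions the PCIe transfer dominates the 4090's time per token (roughly 2 tok/s versus 6 tok/s), which is why a slower but unified memory pool can come out ahead on models that don't fit in VRAM.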
