
AMD Ryzen AI Max+ 395 Analysis - Strix Halo to rival Apple M4 Pro/Max with 16 Zen 5 cores and iGPU on par with RTX 4070 Laptop

Started by Redaktion, February 18, 2025, 15:01:32


Redaktion

The new Ryzen AI Max+ 395 is AMD's latest high-end mobile processor. With up to 16 Zen 5 cores, a powerful Radeon GPU, a fast NPU, and up to 128 GB of RAM, the Ryzen AI Max+ is meant to be the ideal companion for gaming, content creation, and AI development.

https://www.notebookcheck.net/AMD-Ryzen-AI-Max-395-Analysis-Strix-Halo-to-rival-Apple-M4-Pro-Max-with-16-Zen-5-cores-and-iGPU-on-par-with-RTX-4070-Laptop.963274.0.html

Donkey545

While the standard suite of benchmarks is appreciated in this review, these performance metrics are largely irrelevant to the target audience of a product like this. The AI Max chips, with their high-bandwidth unified memory architecture, are targeted at LLM inference users. A valuable benchmark for these users would be running various LLMs in verbose mode to check the tokens/s. The advantage of this product is that it can fit massive models in memory compared to even the highest-end dGPUs. A great comparison for this product segment would be the performance of Llama 3.3 70B under Ollama on the CPU and on the GPU (using ROCm) against Apple's M4 series hardware.
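Ollama's `--verbose` flag prints a stats footer after each response, including an eval rate in tokens/s, which is exactly the figure a benchmark like this would collect. A minimal sketch of pulling that number out of the footer (the sample text is illustrative, not a real measurement):

```python
import re

def parse_eval_rate(verbose_output: str) -> float:
    """Extract the tokens/s figure from an `ollama run --verbose` footer."""
    match = re.search(r"eval rate:\s*([\d.]+)\s*tokens/s", verbose_output)
    if match is None:
        raise ValueError("no eval rate found in output")
    return float(match.group(1))

# Footer shape as printed by e.g. `ollama run llama3.3:70b --verbose`
# (the figures here are made up for illustration):
sample = """
total duration:       42.1s
eval count:           256 token(s)
eval rate:            6.08 tokens/s
"""
print(parse_eval_rate(sample))  # 6.08
```

Logging this across several models and quantization levels would give exactly the comparison table the post asks for.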

Kravis

I know, right? Give us tok/sec for various LLMs at various levels of quantization. People aren't snapping up 4090s and 5090s these days to play games at the highest fps; they're buying them to run local AI.
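The quantization levels mentioned here map directly to memory footprint: as a first-order rule of thumb, weights take parameter count × bits per weight / 8 bytes. A quick sketch (nominal figures only; real GGUF quants carry some metadata overhead, and the KV cache needs memory on top):

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed just for the weights, ignoring KV cache etc."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"70B @ {bits}-bit: {weight_footprint_gb(70, bits):.0f} GB")
# 70B @ 16-bit: 140 GB
# 70B @ 8-bit: 70 GB
# 70B @ 4-bit: 35 GB
```

This is why a 128 GB unified-memory machine is interesting: even a 24 GB dGPU cannot hold a 70B model at 4-bit, while Strix Halo can.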

Yeshy

For the "Power Consumption / Cyberpunk 2077 ultra Efficiency", do you / could you do a version that combines the CPU and GPU power?

If, let's say, the 4070 at 60 W equals the 8060S at 60 W, that's great, but it ignores that the 4070 also has a CPU to power alongside it.

I don't know what a fair way to test would be, besides plotting curves; comparing a 100 W 4070 is "unfair" since you get diminishing returns as you approach 100 W on it.

Maybe just 1080p Medium with a 60 fps limit? Or test different TDP limits, but that would be arbitrary.

Callum

I would love to see a CFD test run on all of these, or at least a meshing process. This is a great general test of many things and overall performance of motherboards, CPUs, gpus and memory. Along with added LLM testing...

Alpha_Lyrae

Quote from: Yeshy on February 18, 2025, 23:15:26
For the "Power Consumption / Cyberpunk 2077 ultra Efficiency", do you / could you do a version that combines the CPU and GPU power?

If, let's say, the 4070 at 60 W equals the 8060S at 60 W, that's great, but it ignores that the 4070 also has a CPU to power alongside it.

I don't know what a fair way to test would be, besides plotting curves; comparing a 100 W 4070 is "unfair" since you get diminishing returns as you approach 100 W on it.

Maybe just 1080p Medium with a 60 fps limit? Or test different TDP limits, but that would be arbitrary.

Yeah, total system power should be used when comparing APUs/SoCs against CPU+dGPU setups. You'll find that power consumption is much higher in discrete hardware simply by design: two chips (CPU and GPU), two sets of memory (LPDDR5/DDR5 and GDDR6), and more VRM MOSFETs to supply power.
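The point about total system power can be made concrete: efficiency should be fps divided by the sum of all the power rails involved, not GPU power alone. A small sketch (all wattage and fps figures below are hypothetical, purely to illustrate the comparison):

```python
def fps_per_watt(fps: float, *power_draws_w: float) -> float:
    """Efficiency over the whole platform, not just the GPU rail."""
    return fps / sum(power_draws_w)

# Hypothetical numbers: a 60 W dGPU still needs a ~35 W CPU beside it,
# while an APU's single 60 W budget covers both CPU and GPU.
dgpu = fps_per_watt(70, 60, 35)  # GPU power + CPU power
apu = fps_per_watt(62, 60)       # shared package power
print(f"dGPU: {dgpu:.2f} fps/W, APU: {apu:.2f} fps/W")
```

Even when the dGPU wins on raw fps, folding the CPU's draw into the denominator can flip the efficiency ranking, which is exactly the scenario the comment describes.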

A

Aren't LLMs broken under Windows on AMD due to the poor state of MS DirectML? So it may not get great LLM results unless you load up Linux, assuming amdgpu supports it. Then there's the fact that a lot of the software and libraries out there aren't going to use the NPU to assist.

Papajon

"Gets destroyed by last year's QC CPU in MC efficiency and SC performance."

Seems like a great chip in portable devices.

Papajon

Quote from: Donkey545 on February 18, 2025, 17:01:45
While the standard suite of benchmarks is appreciated in this review, these performance metrics are largely irrelevant to the target audience of a product like this. The AI Max chips, with their high-bandwidth unified memory architecture, are targeted at LLM inference users. A valuable benchmark for these users would be running various LLMs in verbose mode to check the tokens/s. The advantage of this product is that it can fit massive models in memory compared to even the highest-end dGPUs. A great comparison for this product segment would be the performance of Llama 3.3 70B under Ollama on the CPU and on the GPU (using ROCm) against Apple's M4 series hardware.

If it's for AI users, why is it a laptop chip?

LL

Notebookcheck should test it with Unreal Engine, where GPU memory is crucial, and compare it with Nvidia options.

It would also be a practical speed test of dedicated GPU memory versus system RAM used for the GPU. Does it matter?


davidm

People want to use "AI" on their notebooks too, for always-running local assistants.

The review plays a shell game; it should just talk about memory bandwidth instead of changing things around in the headline, screenshots, text, etc. As others have said, memory bandwidth is the main thing that matters for LLMs; the NPU is not really used for them, it's GPU cores and memory bandwidth that count. Mixture-of-Experts models may become more popular, and they can run OK on slower memory, but x86 still needs something at least as fast as Apple's Max chips.
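The bandwidth argument can be quantified: for a dense model, decode speed is roughly bounded by memory bandwidth divided by the bytes read per token, since every token has to stream all the weights once. A first-order estimate that ignores compute, caching, and overlap (the bandwidth and model-size figures are nominal):

```python
def max_tokens_per_s(mem_bw_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed: each token reads every weight once."""
    return mem_bw_gb_s / model_size_gb

# Strix Halo's 256-bit LPDDR5X-8000 interface is ~256 GB/s;
# a 70B model at 4-bit quantization is roughly 35 GB of weights.
print(f"{max_tokens_per_s(256, 35):.1f} tokens/s")  # 7.3
```

This is also why MoE models help on slower memory: only the active experts' weights are read per token, shrinking the effective model size in the denominator.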

GERMAN_MEN

The Asus ROG Flow Z13 GZ302 is a true marvel that brings together what every company needs: mobility and performance.
It will be my next laptop or tablet, I don't care which. The most important thing, and what 99% of users are looking for, is an APU with the perfect balance between CPU and iGPU, and that is what the AMD Ryzen AI Max+ 395, Ryzen AI Max 390, Ryzen AI Max 385, and Ryzen AI Max 380 processors offer.

CJ

Quote from: Papajon on February 19, 2025, 11:34:42
Quote from: Donkey545 on February 18, 2025, 17:01:45
While the standard suite of benchmarks is appreciated in this review, these performance metrics are largely irrelevant to the target audience of a product like this. The AI Max chips, with their high-bandwidth unified memory architecture, are targeted at LLM inference users. A valuable benchmark for these users would be running various LLMs in verbose mode to check the tokens/s. The advantage of this product is that it can fit massive models in memory compared to even the highest-end dGPUs. A great comparison for this product segment would be the performance of Llama 3.3 70B under Ollama on the CPU and on the GPU (using ROCm) against Apple's M4 series hardware.

If it's for AI users, why is it a laptop chip?

That question makes about as much sense as asking in an RTX 4090 mobile review, "If it's for gamers, why is it a laptop chip?" Sometimes you might want to run an LLM offline. Sure, you could SSH into a server at home over a VPN or whatever, but maybe you just want a laptop that does everything and don't want a desktop. Also, a 128 GB Strix Halo laptop might cost $3,000, while an 80 GB H100 costs tens of thousands of dollars and requires a server to put it in.

Aldf

Hey, the efficiency-per-watt figures for the Cinebench 2024 test don't add up after you added the update.
