Apple M5 Pro & M5 Max GPU Analysis - M5 Max GPU on par with the GeForce RTX 5070 and faster than Strix Halo

Redaktion · March 10, 2026, 00:46:33

Apple has launched its new M5 Pro and M5 Max SoCs with new GPU models. We test the 20-core M5 Pro GPU and the flagship 40-core M5 Max GPU in synthetic benchmarks and gaming tests, and compare their efficiency with previous Apple GPUs and Nvidia's Blackwell graphics cards.

https://www.notebookcheck.net/Apple-M5-Pro-M5-Max-GPU-Analysis-M5-Max-GPU-on-par-with-the-GeForce-RTX-5070-and-faster-than-Strix-Halo.1246060.0.html

dada_dave · March 10, 2026, 01:37:08

Just FYI Blender 3.3 does not use hw-accelerated ray tracing on Apple Silicon. That didn't come in until Blender 4.2 I think. Though it doesn't explain why the M5 Max did worse than the M4 Max here, it might explain why CB 2024 GPU shows the more expected results.

joneskind · March 10, 2026, 04:11:19

Quote from: dada_dave on March 10, 2026, 01:37:08
Just FYI Blender 3.3 does not use hw-accelerated ray tracing on Apple Silicon. That didn't come in until Blender 4.2 I think. Though it doesn't explain why the M5 Max did worse than the M4 Max here, it might explain why CB 2024 GPU shows the more expected results.

Even worse. Blender has been using HW-accelerated ray tracing on Nvidia cards since 3.3.

The 40-core M5 Max should be as powerful as the RTX 5090 laptop GPU in Blender 4.5, if not more so.

See opendata.blender

PumpkinFury · March 10, 2026, 08:18:53

My Strix Scar 18 5090 Laptop (175w) hit 6,752 in Steel Nomad.

Detechtive · March 10, 2026, 09:59:19

Quote from: joneskind on March 10, 2026, 04:11:19
Quote from: dada_dave on March 10, 2026, 01:37:08
Just FYI Blender 3.3 does not use hw-accelerated ray tracing on Apple Silicon. That didn't come in until Blender 4.2 I think. Though it doesn't explain why the M5 Max did worse than the M4 Max here, it might explain why CB 2024 GPU shows the more expected results.

Even worse. Blender has been using HW-accelerated ray tracing on Nvidia cards since 3.3.

The 40-core M5 Max should be as powerful as the RTX 5090 laptop GPU in Blender 4.5, if not more so.

See opendata.blender

I did check. Assuming the Apple GPU scales linearly from the M5 to the M5 Max, Apple's best GPU is still at least 1,000 points behind the RTX 5090 laptop. And remember, the Nvidia chip is already more than a year old at this point. (Backends compared: Metal on Apple, OptiX on Nvidia.)

Doesn't line up · March 10, 2026, 12:44:29

A 41% power efficiency improvement over the M4 Max 40-core GPU is very good, but in Cyberpunk 2077 / Ultra Preset (FSR off) it's only 8%? Doesn't line up.

pimpom · March 10, 2026, 13:06:29

Blender v3.3 does not support MetalRT you idiot.

davidm · March 10, 2026, 16:27:43

It seems this site still doesn't register that it's not just about the memory type (LPDDR5x-9600); it's also, and hugely, about the interface width. Strix Halo has a 256-bit interface, the Mac Pro and Max have 384-bit and 512-bit interfaces. Meanwhile, typical PCs have a 128-bit interface. That's where a lot of the performance comes from, and it's an area where PCs mostly don't compete unless you go to a huge, power-hungry server board. It's been this way for a while. Macs are a lot more expensive, but I believe Strix Halo is selling well even at a premium. I'd happily pay double for a ThinkPad with a 384-bit or wider RAM interface.

Large L2 Cache too · March 10, 2026, 16:44:37

Quote from: davidm on March 10, 2026, 16:27:43
the Mac Pro and Max have 384-bit and 512-bit interfaces

Do you have a source for this? Apple generally doesn't disclose such specifications.

I'm slightly confused myself, because multiple reviews and sites state that the M5 Pro and M5 Max are exactly the same chip this time.

So, if true, are they artificially segmenting a 512-bit interface in software to use only 384 bits, or are they binning chips with hardware defects to get the different effective bandwidths? (I thought 3nm was a fairly mature process with very few defective chips.)

dada_dave

Quote from: Large L2 Cache too on March 10, 2026, 16:44:37
Quote from: davidm on March 10, 2026, 16:27:43
the Mac Pro and Max have 384-bit and 512-bit interfaces

Do you have a source for this? Apple generally doesn't disclose such specifications.

I'm slightly confused myself, because multiple reviews and sites state that the M5 Pro and M5 Max are exactly the same chip this time.

So, if true, are they artificially segmenting a 512-bit interface in software to use only 384 bits, or are they binning chips with hardware defects to get the different effective bandwidths? (I thought 3nm was a fairly mature process with very few defective chips.)

No, the Max and Pro chips have the same CPU die but different GPU dies and different memory bandwidths. Apple has in fact disclosed bandwidth information (and, in the past, the number of memory controllers), and people can work out the number of memory controllers from the bandwidth and RAM type. The chips also follow a fairly regular pattern, and the M5 generation is similar to the M4 generation.
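
As a rough illustration of how an interface width can be inferred from a published bandwidth and the RAM type, here is a minimal Python sketch. The function name is made up for this example, and it assumes that the LPDDR5x-9600 transfer rate mentioned in this thread applies to the M5 Pro and that the 307.2 GB/s figure cited later in the thread is the peak theoretical bandwidth. It is a back-of-the-envelope estimate, not a confirmed specification.

```python
def bus_width_bits(bandwidth_gb_s: float, transfer_rate_mt_s: float) -> float:
    """Estimate memory interface width from peak bandwidth and transfer rate.

    Peak bandwidth [GB/s] = transfer rate [MT/s] * width [bits] / 8 / 1000,
    so the width follows by rearranging that relation.
    """
    return bandwidth_gb_s * 1000 * 8 / transfer_rate_mt_s


# Figures taken from this thread; applying 9600 MT/s to the M5 Pro is an assumption.
m5_pro_width = bus_width_bits(bandwidth_gb_s=307.2, transfer_rate_mt_s=9600)
print(f"Estimated M5 Pro interface width: {round(m5_pro_width)} bits")
```

With these inputs the estimate comes out at 256 bits, which does not match the 384-bit figure quoted above for the Pro, so both numbers should be treated as unconfirmed until checked against Apple's published specifications.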

Alex

Thanks for the quick testing, I'm particularly grateful you include M5 Pro, as that's the version I'm personally interested in.

Would be great to see something like LM Studio testing, as this was one of the major improvements with these new chips and M5 was indeed noticeably better at it than the previous generation.

Note that you'd want to report both the prefill speed and tokens/second. For the best comparison, you can use any GGUF model on Nvidia/AMD and MLX models on Apple, as MLX adds some special optimizations for Apple's GPUs. Nvidia's CUDA is quite well optimized with pretty much any GGUF version.

No need to really testLLM

@Alex
For 3rd-party LLM prompt processing (pp) speed, only the iGPU gaming performance matters.
-> So, if the M5's iGPU performance increased by "up to 30%"² vs the M4, then the 3rd-party AI/LLM prompt processing speed will also increase by roughly the same amount.

For LLM token generation (tg), only the memory bandwidth matters, assuming there is enough unified memory to fit the LLM in the first place (so the memory size matters too).
-> So, if the memory bandwidth increased by 12.5% (= 307.2 GB/s (M5 Pro) / 273 GB/s (M4 Pro))¹, then the AI/LLM token generation speed will also increase by roughly the same amount.

¹ en.wikipedia.org/wiki/MacBook_Pro_(Apple_silicon)
² en.wikipedia.org/wiki/Apple_M5#Performance

Apple's claim of

Quote from: en.wikipedia.org/wiki/Apple_M5#Performance
Peak GPU AI compute: over 4× faster

will be relevant only to Apple's own tightly integrated (1st-party) solutions, not to 3rd-party LLMs. That said, I have nothing against testing 3rd-party LLM pp and tg speeds.
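
The bandwidth arithmetic in this post can be sketched in a few lines of Python. The function names and the 20 GB model size are hypothetical and chosen only for illustration. The tokens-per-second figure is a simple upper bound that assumes a dense model whose weights are all read once per generated token; real results will be lower.

```python
def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of `new` over `old` (0.125 means +12.5%)."""
    return new / old - 1.0


def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on token generation speed.

    Assumes every generated token streams the full set of weights
    from memory exactly once (dense model, no batching).
    """
    return bandwidth_gb_s / model_size_gb


M5_PRO_BW = 307.2  # GB/s, figure quoted in this thread
M4_PRO_BW = 273.0  # GB/s, figure quoted in this thread
MODEL_SIZE_GB = 20.0  # hypothetical quantized model size, for illustration only

gain = relative_gain(M5_PRO_BW, M4_PRO_BW)
print(f"Bandwidth gain: {gain:.1%}")  # 12.5%

for name, bw in (("M4 Pro", M4_PRO_BW), ("M5 Pro", M5_PRO_BW)):
    ceiling = max_tokens_per_second(bw, MODEL_SIZE_GB)
    print(f"{name}: at most {ceiling:.2f} tokens/s for a {MODEL_SIZE_GB:.0f} GB model")
```

Because the ceiling is proportional to bandwidth, the ratio between the two chips stays at about 12.5% whatever model size is plugged in, which is the point the post makes about token generation.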
