Apple MacBook Air 15 M5 Review - Very powerful, fanless and without competition

juri · March 13, 2026, 03:09:11

12 W average at idle?? How can that be? Lunar Lake draws only a third of that.
Why are you no longer running the battery test with idle and video playback?

And still no matte option for the screen, so idiotic without any reason.
Until then I won't consider an Air.

How tastes differ · March 13, 2026, 09:35:10

Quote from: juri on March 13, 2026, 03:09:11
and still no matte option for the screen, so idiotic without any reason.

It's not 12W, look again. Maybe the matte coating on your screen played a prank on you ;)
And I'm so glad that it's glossy (richer colors, sharper text). Mind you, although it's glossy, the anti-reflective coating is very good.

Interesting for reviewers · March 16, 2026, 11:07:12

This may be interesting for reviewers:

Quote from: youtube.com/watch?v=HKxIGgyeISM
Apple's Energy Model - Deconstructed

In this video, I reverse engineer Apple's Energy Model on the Mac Studio M4 Max. In the process I explain why and how measured DC power can appear up to 3 times higher than reported M4 Max GPU power.

[..]

only bandwidth = more tg · March 16, 2026, 12:04:31

As expected, a video by Alex Ziskind (youtube.com/watch?v=XGe7ldwFLSE) proves, in this case, that Apple's claim of

Quote from: en.wikipedia.org/wiki/Apple_M5#Performance
Peak GPU AI compute: over 4× faster

does not apply to running third-party LLMs. Only the higher RAM/unified memory bandwidth speeds up token generation, by 28%: 153.6 GB/s (M5) / 120 GB/s (M4) = 1.28. (153.6 GB/s = 128-bit * 9600 MT/s / 8 / 1000)
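
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `peak_bandwidth_gbs` is made up for illustration, and the M4 transfer rate of 7500 MT/s is an assumption, back-derived from the 120 GB/s figure (120 * 8 * 1000 / 128).

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (bits * MT/s -> bytes -> GB)."""
    return bus_width_bits * transfer_rate_mts / 8 / 1000


m5 = peak_bandwidth_gbs(128, 9600)  # values from the post above
m4 = peak_bandwidth_gbs(128, 7500)  # 7500 MT/s is assumed, see lead-in

print(f"M5: {m5:.1f} GB/s, M4: {m4:.1f} GB/s, ratio: {m5 / m4:.2f}")
```

These are theoretical peaks, so real-world token generation would only track the ratio if it is limited by memory bandwidth, which is what the video suggests.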

only bandwidth = more tg · March 16, 2026, 12:05:46

Or look up the 153.6 GB/s and 120 GB/s values here: en.wikipedia.org/wiki/Apple_silicon#M-series_SoCs.

some llama.cpp benchmarks · April 10, 2026, 16:59:33

First I confirmed that the CPU and GPU scores of my new Air 15 M5 align with what is expected.

Here are some benchmarks from llama.cpp's llama-bench.
First with battery saving mode on:

/Users/../llama-b8740/llama-bench --no-warmup -m /Users/../Qwen3.5-9B-UD-Q4_K_XL.gguf -p 128 -n 256 -t 1,2,3,4
ggml_metal_device_init: testing tensor API for f16 support
ggml_metal_library_init_from_source: error compiling source
ggml_metal_device_init: - the tensor API is not supported in this environment - disabling
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 0.013 sec
ggml_metal_rsets_init: creating a residency set collection (keep_alive = 180 s)
ggml_metal_device_init: GPU name:  MTL0
ggml_metal_device_init: GPU family: MTLGPUFamilyApple10  (1010)
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_device_init: GPU family: MTLGPUFamilyMetal4  (5002)
ggml_metal_device_init: simdgroup reduction  = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = false
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 12713.12 MB
| model                          |      size |    params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |          pp128 |        110.11 ± 4.18 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |          tg256 |          9.50 ± 0.30 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |          pp128 |        107.58 ± 6.53 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |          tg256 |          9.45 ± 0.08 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |          pp128 |        110.79 ± 1.14 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |          tg256 |          9.29 ± 0.09 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |          pp128 |        110.78 ± 1.42 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |          tg256 |          8.75 ± 0.95 |

With battery saving mode off (the default):

/Users/../llama-b8740/llama-bench --no-warmup -m /Users/../Qwen3.5-9B-UD-Q4_K_XL.gguf -p 128 -n 256 -t 1,2,3,4
[..]
| model                          |      size |    params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |          pp128 |        216.04 ± 8.47 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |          tg256 |         20.17 ± 0.04 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |          pp128 |        209.43 ± 0.35 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |          tg256 |         20.21 ± 0.03 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |          pp128 |        196.55 ± 5.79 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |          tg256 |         18.34 ± 3.22 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |          pp128 |        193.50 ± 2.13 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |          tg256 |         19.57 ± 0.18 |
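
To put a number on the difference between the two modes, here is a small sketch that divides the default-mode results by the battery-saving results. The t/s values are copied from the two tables above (threads = 1); the variable names are just for illustration.

```python
# Mean t/s values copied from the two llama-bench tables above (threads = 1)
battery_saving = {"pp128": 110.11, "tg256": 9.50}
default_mode = {"pp128": 216.04, "tg256": 20.17}

# Speedup factor of default mode over battery saving mode, per test
speedup = {test: default_mode[test] / battery_saving[test] for test in battery_saving}

for test, factor in speedup.items():
    print(f"{test}: {factor:.2f}x faster with battery saving off")
```

Both prompt processing and token generation come out at roughly 2x, within the run-to-run spread shown in the tables.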

some llama.cpp benchmarks

With battery saving mode on (with it off, you guessed it, the numbers are roughly 2x, just like above):

/Users/../llama-b8740/llama-bench --no-warmup -m /Users/../Qwen3.5-9B-UD-Q4_K_XL.gguf -p 64 -n 64 -t 1-12

| model                          |      size |    params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |            pp64 |        107.20 ± 6.39 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |            tg64 |          9.75 ± 0.06 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |            pp64 |        109.29 ± 0.92 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      2 |            tg64 |          9.62 ± 0.01 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |            pp64 |        109.05 ± 0.96 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      3 |            tg64 |          9.47 ± 0.00 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |            pp64 |        108.81 ± 0.81 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |            tg64 |          9.23 ± 0.13 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      5 |            pp64 |        108.59 ± 0.62 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      5 |            tg64 |          9.20 ± 0.01 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      6 |            pp64 |        108.57 ± 1.21 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      6 |            tg64 |          9.12 ± 0.01 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      7 |            pp64 |        108.34 ± 1.24 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      7 |            tg64 |          9.02 ± 0.04 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      8 |            pp64 |        108.07 ± 1.57 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      8 |            tg64 |          8.97 ± 0.01 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      9 |            pp64 |        107.81 ± 0.76 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      9 |            tg64 |          8.76 ± 0.15 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     10 |            pp64 |        107.75 ± 0.58 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     10 |            tg64 |          8.82 ± 0.03 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     11 |            pp64 |        107.64 ± 0.44 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     11 |            tg64 |          8.71 ± 0.03 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     12 |            pp64 |        107.18 ± 0.34 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     12 |            tg64 |          8.67 ± 0.05 |

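This confirms that the number of threads hardly matters here: pp64 stays at about 107-109 t/s, and tg64 even drops slightly, from 9.75 t/s with 1 thread to 8.67 t/s with 12. On my desktop 7800X3D, 4 threads give pretty much the fastest tokens per second.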
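
For anyone who wants to compare their own runs, here is a rough sketch of how the thread sweep can be summarized. It parses llama-bench table rows as pasted text; `parse_row` is a helper made up for this post, not part of llama.cpp, and it assumes the seven-column layout shown above. The four tg64 rows are copied from the table above.

```python
# Four tg64 rows copied from the llama-bench table above (battery saving mode on)
ROWS = """\
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      1 |            tg64 |          9.75 ± 0.06 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      4 |            tg64 |          9.23 ± 0.13 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |      8 |            tg64 |          8.97 ± 0.01 |
| qwen35 9B Q4_K - Medium        |  5.55 GiB |    8.95 B | MTL,BLAS  |     12 |            tg64 |          8.67 ± 0.05 |
"""


def parse_row(line: str) -> tuple[int, str, float]:
    """Return (threads, test, mean t/s) from one llama-bench table row."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    mean = float(cells[6].split("±")[0])  # drop the "± stddev" part
    return int(cells[4]), cells[5], mean


# Map thread count -> mean tokens per second
results = {threads: mean for threads, _test, mean in map(parse_row, ROWS.splitlines())}

drop_pct = (results[1] - results[12]) / results[1] * 100
print(f"tg64: {results[1]} t/s at 1 thread, {results[12]} t/s at 12 threads ({drop_pct:.1f}% slower)")
```

The same helper should work on the pp rows, since only the test name and the t/s column differ.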
