Quote from: en.wikipedia.org/wiki/Apple_M5#Performance
Peak GPU AI compute: over 4× faster
This does not apply to running 3rd-party LLMs. Only the increase in RAM/unified memory bandwidth speeds up token generation, by 28% (153.6 GB/s (M5) / 120 GB/s (M4) = 1.28), where 153.6 GB/s = 128-bit * 9600 MT/s / 1000 / 8.
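For anyone who wants to redo the bandwidth arithmetic, here is a minimal Python sketch. The function name is made up for illustration, the 128-bit / 9600 MT/s / 120 GB/s figures are the ones given in the post, and the whole thing assumes token generation is purely memory-bandwidth-bound, so it is a theoretical peak and not a benchmark.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal, 1 GB = 1000 MB).

    Hypothetical helper: bytes moved per transfer times transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8          # 128-bit bus -> 16 bytes
    mb_per_second = bytes_per_transfer * transfer_rate_mts
    return mb_per_second / 1000                      # MB/s -> GB/s


# Figures as stated in the post above (not measured here).
m5_gbs = peak_bandwidth_gbs(bus_width_bits=128, transfer_rate_mts=9600)
m4_gbs = 120.0

# If generation is bandwidth-bound, the speedup is roughly the bandwidth ratio.
ratio = m5_gbs / m4_gbs
print(f"M5 peak: {m5_gbs:.1f} GB/s, M5/M4 ratio: {ratio:.2f}")
```

Real-world gains can be lower than this ratio, since sustained bandwidth and thermals both matter.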
Quote from: youtube.com/watch?v=HKxIGgyeISM
Apple's Energy Model - Deconstructed
In this video, I reverse engineer Apple's Energy Model on the Mac Studio M4 Max. In the process I explain why and how measured DC power can appear up to 3 times higher than reported M4 Max GPU power.
[..]
Quote from: juri on March 13, 2026, 03:09:11
and still no matte option for the screen, so idiotic without any reason.
It's not 12W, look again. Maybe the matte coating on your screen played a prank on you ;)
Quote from: not_anton on March 11, 2026, 17:16:44
Good to know, but will those quants fit (if I had to guess, I'd say yes)? There's also sysctl iogpu.wired_limit_mb=<MB> (I know not to assign too much to the VRAM (1-3 GB may be ok), as the OS may start to write/swap to the SSD).
Quote from: Will MBAir fit 27B quants on March 11, 2026, 12:23:33
Will the 24 GB RAM option fit Qwen3.5-27B-UD-Q4_K_XL.gguf (17.6 GB) or Qwen3.5-27B-UD-Q5_K_XL.gguf (20.2 GB)? (huggingface.co/unsloth/Qwen3.5-27B-GGUF) (I know there's mlx-community/Qwen3.5-27B-4bit (16.1 GB) too, but I don't know if its perplexity is good)
I have a 15" M2 Air with 24GB RAM, but it can only run 3B models max because of overheating. Work is fine, gaming is fine, but LLMs throttle it to 0.4GHz on GPU in a minute. Sorry, you would need something with a fan or two to make those models useful.
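A rough way to sanity-check the "will it fit" question is sketched below in Python. Everything except the file sizes and the sysctl name is an assumption: the helper names are hypothetical, the 3 GB left for the OS is one reading of the "1-3 GB" remark, the 2 GB allowance for KV cache and runtime buffers is a guess that grows with context length, and GB vs GiB is ignored. It says nothing about thermal throttling.

```python
TOTAL_RAM_GB = 24.0     # the 24 GB option asked about
OS_RESERVE_GB = 3.0     # assumption: memory left to macOS
OVERHEAD_GB = 2.0       # assumption: KV cache + runtime buffers

# File sizes as quoted in the thread.
MODELS_GB = {
    "Qwen3.5-27B-UD-Q4_K_XL.gguf": 17.6,
    "Qwen3.5-27B-UD-Q5_K_XL.gguf": 20.2,
    "mlx-community/Qwen3.5-27B-4bit": 16.1,
}


def gpu_budget_gb(total_gb: float, os_reserve_gb: float) -> float:
    """Memory the GPU may wire if the rest is left to the OS (hypothetical helper)."""
    return total_gb - os_reserve_gb


def fits(model_gb: float, budget_gb: float, overhead_gb: float) -> bool:
    """True if weights plus the assumed overhead stay within the budget."""
    return model_gb + overhead_gb <= budget_gb


budget = gpu_budget_gb(TOTAL_RAM_GB, OS_RESERVE_GB)
results = {name: fits(size, budget, OVERHEAD_GB) for name, size in MODELS_GB.items()}

for name, ok in results.items():
    print(f"{name}: {'fits' if ok else 'does not fit'} in {budget:.0f} GB")

# Value one might pass to the sysctl mentioned above (printed only, not executed).
limit_mb = int(budget * 1024)
print(f"sudo sysctl iogpu.wired_limit_mb={limit_mb}")
```

With a smaller OS reserve or a shorter context the verdict for the Q5 file can flip, so treat the result as a starting point for experimenting, not an answer.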
Quote from: Will MBAir fit 27B quants on March 11, 2026, 12:23:33
Will the 24 GB RAM option fit Qwen3.5-27B-UD-Q4_K_XL.gguf (17.6 GB) or Qwen3.5-27B-UD-Q5_K_XL.gguf (20.2 GB)? (huggingface.co/unsloth/Qwen3.5-27B-GGUF) (I know there's mlx-community/Qwen3.5-27B-4bit (16.1 GB) too, but I don't know if its perplexity is good)
I have a 15" M2 Air with 24GB RAM, but it can only run 3B models max because of overheating. Work is fine, gaming is fine, but LLMs throttle it to 0.4GHz on GPU in a minute. Sorry, you would need something with a fan or two to make those models useful.
Quote from: dumb_oems on March 09, 2026, 11:44:29
This is not emphasized enough
...
the actual user experience, especially on battery power
Quote from: JimD on March 09, 2026, 12:11:02
Fanless is not really an attraction for me.