Mac Studio M3 Ultra with 96 GB
Is it good for running large models locally?
The M3 Ultra's GPU is a bit on the weak side for large-scale inference, so you'll be waiting on prompt prefill for most coding/agent workflows, which feed it long contexts.
Have you tried any other models with this M3 Ultra?
Apple's GPUs are just not very fast for inference. I'd stick to the smaller 7B-18B parameter range, or MoE models like Qwen, if you want usable inference speed.
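To make the trade-off concrete, here's a rough back-of-the-envelope sketch (my own assumptions, not from this thread): weights at b-bit quantization take params × b/8 bytes, and decode speed is roughly bounded by memory bandwidth divided by the bytes of *active* weights read per token. I'm assuming ~800 GB/s bandwidth for the M3 Ultra as a ballpark figure; the function names are hypothetical.

```python
def weight_gb(params_b: float, bits: int = 4) -> float:
    """Approximate in-memory size (GB) of a model's weights
    for params_b billion parameters at the given quantization."""
    return params_b * 1e9 * bits / 8 / 1e9

def decode_ceiling_tok_s(active_params_b: float,
                         bandwidth_gbs: float = 800,
                         bits: int = 4) -> float:
    """Upper bound on decode tokens/sec: each generated token must
    stream all *active* weights through memory at least once."""
    return bandwidth_gbs / weight_gb(active_params_b, bits)

# Dense 70B at 4-bit: 35 GB of weights (fits in 96 GB),
# but every token reads all 35 GB, so decode tops out low.
print(weight_gb(70), "GB,", decode_ceiling_tok_s(70), "tok/s max")

# An MoE with ~3B active parameters reads only ~1.5 GB per token,
# so its decode ceiling is far higher at the same bandwidth.
print(weight_gb(3), "GB active,", decode_ceiling_tok_s(3), "tok/s max")
```

This is why MoE models feel so much faster on Apple silicon: total weights still have to fit in unified memory, but per-token decode cost scales with the active parameters only.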
Any thoughts on M5?
Apple may soon release an M5 version of the Mac Studio/Mini.
$4,699.00
But it looks like we may need an NVIDIA AI Enterprise - DGX Spark license.