> It's kinda hard to believe that someone would stumble onto the landmine of AI performance comparison between Apple Silicon and Nvidia hardware.
I encourage you to update your beliefs about other people. I’m a very technical person, but I work in robotics closer to the hardware level: I design motor controllers and Linux motherboards, write firmware, and build platform-level robotics stacks, but I’ve never done any work that required running inference in a professional capacity. I’ve played with machine learning, even collecting and hand-labeling my own dataset and training a semantic segmentation network, but I’ve only ever had my little desktop with one Nvidia card to run it all. Back in the day, CNN performance was very important and I might have looked at benchmarks, but since the dawn of LLMs, my ability to run networks has been limited entirely by RAM constraints, not by factors like tokens per second. So when I heard that MacBooks have shared memory and can run large models with it, I took note that it could be a (relatively) accessible way to run larger models. I can’t even remotely afford a $6k Mac any more than I could afford a $12k Nvidia cluster machine, so I never got to the practical question of whether there would be any serious performance concerns. It has been idle thinking along the lines of “hmm, I wonder how well that would work”.
So I asked the question. I said roughly “hey can someone explain why OP didn’t go with this cheaper solution”. The very simple answer is that it would be much slower and the performance per dollar would be 10x worse. Great! Question answered. All this rude incredulousness coming from people who cannot fathom that another person might not know the answer is really odd to me. I simply never even thought to check benchmarks because it was never a real consideration for me to buy a system.
Also, about “the #1 topic in the tech sector right now”: funny, in my circles people are talking about unions, AI compute exacerbating climate change, and AI being used to disenfranchise the tech working class and make it more precarious. We all live in bubbles.
It's simply bizarre that you would ask that question when the research to figure it all out is trivially accessible. Everyone thought "unified memory" would be a boon when it was advertised, but Apple never delivered a CUDA alternative. They killed OpenCL in the cradle and pushed developers to Metal Compute Shaders instead of a proper GPGPU layer. If you are an Apple dev, the mere existence of CoreML ought to be the white flag that makes you realize Apple hardware was never made for GPU compute.
Again, I'm not accusing you of bad faith. I'm just saying that asking such a bald-faced, easily Googled question is indistinguishable from flamebait. There is so much signaling that should suggest to you that Apple hardware is far from optimized for AI workloads. You can look at it from the software angle, where Apple offers no accessible GPGPU primitives. You can look at it from a hardware perspective, where Apple cannot beat the performance per watt of desktop or datacenter Nvidia hardware. You can look at it from a practical perspective, where literally nobody is using Apple Silicon for cost-effective inference or training. Every scrap of salient evidence suggests that Apple just doesn't care about AI, and the industry cannot be bothered to do Apple's dirty work for them. Hell, even a passing familiarity with the existence of Xserve should say everything you need to know about Apple competing in markets it can't manipulate.
> funny, in my circles people are talking about unions, AI compute exacerbating climate change, and AI being used to disenfranchise the tech working class and make it more precarious.
Sounds like your circles aren't focused on technology but on popular culture and Twitter topics. Unionization, the "cost" of cloud, and fictional AI-dominated futures were barely cutting-edge in the 90s, let alone today.