by EagnaIonat 3 hours ago
> MacBooks with their unified memory behave like a slow GPU with enormous amount of video RAM. So you can run large smart models slowly.

With a model running on MLX, the speed increase is night and day. Even non-MLX is good.

You also don't pay the cost of copying data from CPU memory into GPU memory.