by simonw 407 days ago
MLX reports peak memory usage at the end of the response. Otherwise I'll use Activity Monitor.

I'm also trusting `get_peak_memory` + some small buffer for now.

It accurately reports peak memory usage for tensors living on the GPU, but it seems to miss some of the non-Metal overhead, however small (https://github.com/aukejw/mlx_transformers_benchmark/issues/...).
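
For anyone wanting to try the "peak + small buffer" approach, here is a rough sketch, not anything taken from the benchmark repo. The helper names and the 10% default margin are made up for illustration. It assumes `get_peak_memory` is exposed at the top level of `mlx.core` (older releases had it under `mx.metal`), and it reads the OS-level peak RSS alongside as a loose cross-check. That is a different measurement, so the two numbers are not expected to match.

```python
import resource
import sys


def process_peak_rss_bytes() -> int:
    """Peak resident set size of this process, as reported by the OS."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in bytes on macOS and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024


def mlx_peak_bytes():
    """Peak memory reported by MLX, or None if MLX is not installed."""
    try:
        import mlx.core as mx
    except ImportError:
        return None
    # Assumption: newer MLX has mx.get_peak_memory; fall back to mx.metal otherwise.
    getter = getattr(mx, "get_peak_memory", None) or mx.metal.get_peak_memory
    return getter()


def with_buffer(peak_bytes: int, buffer_fraction: float = 0.1) -> int:
    """Pad a peak reading by a safety margin (hypothetical 10% default)."""
    return int(peak_bytes * (1 + buffer_fraction))


if __name__ == "__main__":
    mlx_peak = mlx_peak_bytes()
    if mlx_peak is not None:
        print(f"MLX peak:      {mlx_peak / 1e9:.2f} GB")
        print(f"MLX + buffer:  {with_buffer(mlx_peak) / 1e9:.2f} GB")
    print(f"OS peak RSS:   {process_peak_rss_bytes() / 1e9:.2f} GB")
```

Run it at the end of a generation script so the peak covers the whole response. How big the buffer should be depends on how much non-Metal overhead you actually see.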