This seems high. At which quantization? Using LM Studio or something else?
Note: Darkbloom seems to run everything at Q8 with MLX.