Isn't the expensive part of LLMs the training? My understanding is that once they're trained, they can often be optimized to run quite cheaply. Not as cheaply as a well-designed program, but cheaply enough that it shouldn't be too prohibitive to run.
I'd love to be shown I'm wrong, but I thought most 'runtime' LLMs required a shit-ton of memory. Just downloading one seems to require more storage than I have on this laptop.
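For a rough sense of scale, here's a back-of-the-envelope sketch of how much space the weights alone take at different precisions. The 7-billion-parameter figure is just a hypothetical example, not any particular model, and this ignores activations, caches, and file-format overhead, so treat it as a lower bound rather than a real requirement.

```python
def weight_bytes(n_params: int, bits_per_param: int) -> int:
    """Bytes needed to store the weights alone (no activations or caches)."""
    return n_params * bits_per_param // 8


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical model size, for illustration only
    for bits in (32, 16, 8, 4):
        size = to_gib(weight_bytes(n_params, bits))
        print(f"{bits:>2}-bit weights: {size:.1f} GiB")
```

The point is just that storage scales linearly with bits per parameter, so dropping from 16-bit to 4-bit weights cuts the download to a quarter of the size.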
I don't think it's a comparison - they're just saying that it's fast enough even on old mobile hardware, so it can be used on new hardware even faster.
I don't have a problem with a background task taking a minute or something...