by robgough
20 days ago
As clever as this is, it seems like the names are fairly straightforward (as you'd want!). Did you try using the on-device Apple Foundation model at all? That's actually pretty powerful for a use case like this, and if you're happy to require that the user already has Apple Intelligence turned on, your shipped app can end up being tiny.

The biggest concern for an app like this is how much RAM you end up using trying to run it, especially if we end up with lots of different apps all doing the same thing.

Being able to super-power apps with on-device models is a lot of fun. I recently did the same, building my own dictation app using small local models, and I still can't believe how effective it is. The download is just 20 MB, though it will download Parakeet (~475 MB) for audio. It can use the on-device model as the second-pass LLM, and that works pretty well (though better models are available to download and use, e.g. Llama 3.2 4-bit and Qwen 2.5 7B 4-bit).

I'm currently building a little tool for a professional photographer friend to go through and classify the images in their photoshoots, so I can build a searchable db for them to quickly find very specific images in the future. I simply don't think it would have been possible for me to build a tool like that just a couple of years ago at any price.
The idea is to ship more powerful lightweight free models as they become available. I'm looking forward to Gemma 5!
> The biggest concern for an app like this is how much RAM you end up using trying to run it
You are totally right. A feature for a future version would be to unload the model when the app is idle and only load it again the next time the user takes a screenshot. It is a trade-off between the latency of generating the names and RAM usage.
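
To make the idle-unload idea concrete, here is a rough sketch in Python. This is not the app's code; `IdleUnloadingModel`, `loader`, `idle_seconds`, and `unload_if_idle` are hypothetical names, and the "model" is just any callable. The point is only the shape of the trade-off: the first request after an unload pays the load latency again, and in exchange the RAM is released while nothing is happening.

```python
import threading
import time


class IdleUnloadingModel:
    """Load a model lazily and drop it after a period without use.

    Hypothetical sketch: `loader` is any zero-argument function that
    returns a callable model; nothing here is tied to a real runtime.
    """

    def __init__(self, loader, idle_seconds, clock=time.monotonic):
        self._loader = loader
        self._idle_seconds = idle_seconds
        self._clock = clock              # injectable so it can be tested
        self._model = None
        self._last_used = 0.0
        self._lock = threading.Lock()
        self.load_count = 0              # how many times we paid the load cost

    @property
    def loaded(self):
        return self._model is not None

    def run(self, prompt):
        """Run the model, loading it first if it is not in memory."""
        with self._lock:
            if self._model is None:
                self._model = self._loader()   # the latency cost lands here
                self.load_count += 1
            result = self._model(prompt)
            self._last_used = self._clock()
            return result

    def unload_if_idle(self):
        """Release the model if it has been unused long enough.

        Returns True if the model was unloaded by this call.
        """
        with self._lock:
            if self._model is None:
                return False
            if self._clock() - self._last_used >= self._idle_seconds:
                self._model = None             # drop the reference, freeing RAM
                return True
            return False
```

In a real app, something like a periodic timer would call `unload_if_idle()`, and the screenshot handler would call `run(...)`. A larger `idle_seconds` favors latency, a smaller one favors memory.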