Show HN: I made a Gemma 4 Mac app that names screenshots with local AI (snapname.app)
7 points by joas_coder 23 days ago
I made my first macOS utility app, and it ships with a bundled Gemma 4 model, specifically the E4B variant. Bundling it makes the app's DMG 5.3 GB, but I think that is a small price for the power this free local model provides.

It runs fine on CPU and can also run on the Apple Silicon GPU, although I did not notice any performance improvement with the GPU (tested on an M5 chip).

I think these lightweight, multimodal local models will open up many possibilities for new software tools where privacy is essential.

4 comments

As clever as this is, it seems like the names are fairly straightforward (as you'd want!) – did you try using the on-device Apple Foundation model at all? That's actually pretty powerful for a use case like this, and if you're happy to require that the user already has Apple Intelligence turned on, your shipped app can end up being tiny. The biggest concern for an app like this is how much RAM you end up using trying to run it, especially if we end up with lots of different apps all doing the same thing.

Being able to super-power apps with on-device models is a lot of fun. I recently did the same when building my own dictation app using small local models, and I still can't believe how effective it is. The download is just 20 MB, though it then downloads Parakeet (~475 MB) for audio. It can use the on-device model as the second-pass LLM, which works pretty well (though better models are available to download and use, e.g. Llama 3.2 4-bit and Qwen 2.5 7B 4-bit).

I'm currently building a little tool for a professional photographer friend to go through and classify images in their photoshoots, so I can build a searchable db for them to quickly find very specific images in the future. I simply don't think it would have been possible for me to build a tool like that just a couple years ago at any price.
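For anyone curious what the "searchable db" part could look like, here is a minimal sketch using Python's built-in sqlite3 module. It assumes some local model has already produced a list of text labels per image; the table layout and the names build_index, add_labels and find_images are made up for illustration and are not taken from the tool described above.

```python
import sqlite3


def build_index(db_path=":memory:"):
    """Open (or create) the label index. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS labels ("
        "path TEXT NOT NULL, label TEXT NOT NULL, UNIQUE(path, label))"
    )
    return conn


def add_labels(conn, path, labels):
    """Store the labels a model produced for one image, lowercased."""
    rows = [(path, label.strip().lower()) for label in labels if label.strip()]
    conn.executemany(
        "INSERT OR IGNORE INTO labels (path, label) VALUES (?, ?)", rows
    )
    conn.commit()


def find_images(conn, *labels):
    """Return paths of images that carry ALL of the given labels."""
    wanted = sorted({label.strip().lower() for label in labels if label.strip()})
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    rows = conn.execute(
        f"SELECT path FROM labels WHERE label IN ({placeholders}) "
        "GROUP BY path HAVING COUNT(DISTINCT label) = ? ORDER BY path",
        (*wanted, len(wanted)),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling find_images(conn, "bride", "outdoor") would return only the images tagged with both labels. A real tool would probably want full-text or embedding search on top, but plain label matching is enough to show the idea.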

Thanks for the feedback. I did not know my Mac had an on-device Apple Foundation model. Is it multimodal? I'll be checking it out and comparing it with Google Gemma 4. I thought Apple was out of the AI model race.

The idea is to ship more powerful lightweight free models as they become available. I'm looking forward to Gemma 5!

> The biggest concern for an app like this is how much RAM you end up using trying to run it

You are totally right. A feature for a future version would be to unload the model when the app is idle and only load it again the next time the user takes a screenshot. It is a trade-off between the latency of generating the names and RAM usage.
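The unload-when-idle idea can be sketched roughly like this. It is Python rather than the app's actual code, and the IdleUnloadingModel class, the loader callable and the 300-second default are all assumptions for illustration: the model is loaded lazily on first use and dropped once it has gone unused for a while.

```python
import time


class IdleUnloadingModel:
    """Holds a model that is loaded on demand and released after idling.

    `loader` is a hypothetical callable that returns the loaded model;
    `clock` is injectable so the timeout logic can be tested.
    """

    def __init__(self, loader, idle_seconds=300.0, clock=time.monotonic):
        self._loader = loader
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._model = None
        self._last_used = None

    @property
    def loaded(self):
        return self._model is not None

    def get(self):
        """Return the model, loading it first if needed (pays the latency)."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self):
        """Drop the model if unused for idle_seconds; return True if dropped."""
        if self._model is None:
            return False
        if self._clock() - self._last_used >= self._idle_seconds:
            self._model = None  # release the reference so memory can be freed
            return True
        return False
```

The app would call get() whenever a screenshot arrives and unload_if_idle() from a periodic timer. A longer idle_seconds means fewer cold starts but more time with the model resident in RAM, which is exactly the trade-off above.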

It's not as powerful as Gemma 4, but I think they likened it to GPT-3. It's perfectly capable of looking at images and classifying them at the level you'll need for this app. And it runs everything on the Apple Neural Engine, so it's decently quick. Of course, this assumes that your users are on Apple Silicon processors (I believe that's the limitation) and that they have enabled Apple Intelligence, which downloads the model at that point.
And a quick YouTube short video on how to put the app on auto-pilot so that you don't have to do anything to have your screenshots receive meaningful names: https://youtube.com/shorts/8bxhBgJvp7M
For anyone who wants to see the workflow before downloading the large app bundle, here’s a short demo: https://www.youtube.com/watch?v=QIt2H_CUYBM
Nice use of local AI!