| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jzapletal 123 days ago

We built an open-source tool that screenshots your desktop and feeds summaries to Claude/Cursor via MCP.

What surprised us:

- Cost: $0.0002/screenshot (we budgeted 100x more), guess cloud vision APIs got cheap fast

- CPU: 5% (exp. 50%) and laptop stays cool

- Quality: night and day vs local models, we tried running vision locally first and it was mediocre

It works by triggering a screenshot on activity, sending it to a cloud vision model for summarization, then deleting the screenshot and storing only the text in local SQLite. You query it via MCP – "what was I working on before lunch?" and Claude actually knows.

2 comments

quinncom 123 days ago

Screen sharing to any remote API is a nonstarter for me. I don’t care if the API claims ZDR; Snowden’s revelations are still echoing. So, I appreciate that the app supports a custom endpoint for local models.

Which local models did you try? GLM-OCR seems like it would excel at this: https://huggingface.co/zai-org/GLM-OCR

link

quinncom 123 days ago

I've got it installed with Qwen3-VL-4B running in LM Studio on my MBP M1 Pro. (Yes, the fans are running.) GLM-OCR didn't work because it returns all text on the screen, despite the instructions asking only for a summary.

Screenshots are summarized in ~28 seconds. Here's the last one:

> "The user switched to the Hacker News tab, displaying item 47049307 with a “Gave Claude photographic memory for $0.0002/screenshot” headline. The chat now shows “Sonnet 4.6” and a message asking “What have I been doing in the past 10 minutes?” profile, replacing prior Signal content. The satellite map background remains unchanged."

The satellite map background remains unchanged message appears in every summary (my desktop background is a random Google Maps satellite image that rotates every hour).

I would like to experiment with custom model instructions – for example, to ignore desktop background images.

Earlier in my testing it was sending screenshots for both of my displays at the same time, which was much slower, but now it's only sending screenshots of my main screen. Does MemoryLane only send screenshots for displays that have active windows?

Here's the first test of the MCP server in Claude – https://ss.strco.de/SCR-20260217-onbp.png – it works!

link

quinncom 123 days ago

Update: I switched to Qwen3 VL 2B (`qwen3-vl-2b-instruct-mlx@bf16`) which is 2.5× faster than 4B (11s vs 18s per screenshot) and my meager M1 Pro is able to keep up without the fans spinning 100% of the time.

link

BloondAndDoom 123 days ago

This is great stuff, have you tried with local models? Summarization etc. is easy but I haven’t played with image to text models locally? Any ideas. I can run 32b models fine and for summarization kind of tasks they are extremely good I’d even say more than necessary

link

fidorka 123 days ago

Hey, just released a new version with support for local models - you just configure the custom endpoint and model name and it should just work. Let us know what you think:)

link