HN Mirror


by fudged71 951 days ago

Classic HN response.

This is just an early taste of a potentially powerful use case.

I understand the vision API doesn’t have memory, so each screenshot it takes is like an entirely new context. If the script/application is able to send WHAT application it’s in, and has some RAG database in the backend to pull knowledge from, this would be incredibly useful.
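A rough sketch of that idea, in Python: since each screenshot is a fresh context, the script can attach the active application's name and a few retrieved notes to every request. Everything here is hypothetical: the `KNOWLEDGE` list, the `retrieve` and `build_prompt` helpers, and the example notes are made up for illustration, and simple word overlap stands in for a real RAG backend.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    app: str   # application the note applies to
    text: str  # the note itself


# Hypothetical knowledge base; a real backend would likely use a vector store.
KNOWLEDGE = [
    Snippet("Blender", "Add a UV sphere with Add > Mesh > UV Sphere."),
    Snippet("Blender", "Press Tab to toggle between Object Mode and Edit Mode."),
    Snippet("GIMP", "Use Filters > Blur > Gaussian Blur to soften an image."),
]


def retrieve(app: str, question: str, k: int = 2) -> list[Snippet]:
    """Rank notes for the active app by word overlap with the question."""
    words = set(question.lower().split())
    candidates = [s for s in KNOWLEDGE if s.app == app]
    ranked = sorted(
        candidates,
        key=lambda s: len(words & set(s.text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(app: str, question: str) -> str:
    """Assemble the text that would be sent alongside each screenshot."""
    notes = "\n".join(f"- {s.text}" for s in retrieve(app, question))
    return (
        f"Active application: {app}\n"
        f"Reference notes:\n{notes}\n"
        f"Question: {question}"
    )
```

Calling `build_prompt("Blender", "how do I add a sphere")` produces a prompt that names the application and includes only the Blender notes, so the model gets the missing context on every call.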

Of course it’s slow now. If you’re legitimately stuck, a couple of seconds for a personalized answer is a perfect trade-off. It will get better.

3 comments

passion__desire 950 days ago

I think every UI application should start logging the actions the user takes so that AI could learn the mappings from actions to visual output. It would be an amazing form of data.


low_tech_love 950 days ago

I could say your comment is a classic 2023 HN comment..? There is no reason to be overly optimistic about other people’s products. Plus, nobody said “oh wow this will never work”, it’s just currently quite bad.


_factor 950 days ago

I couldn’t hear it perfectly, but I’m pretty sure the instructions it provided were to transform the vertices of the cube to make the sphere. It’s like using MS FrontPage. It may look right, but it’s a convoluted mess underneath.
