| HN Mirror



	by fho 353 days ago
	Have you played around with the current vision features? I am pretty sure even gpt-4.1 can give you pretty good descriptions of e.g. screen captures, including being able to "read" and reproduce text.

1 comment

gostsamo 353 days ago

Yes, there are multiple add-ons that let screen readers prompt AI models for image recognition. They work rather well, btw, though the value is often situational. Agentic behavior might help further, though it will need some polishing.
