


	by piker 246 days ago
	This looks really cool for prototyping and playing around. It seems to me, though, that if one is building a modern application that needs to get image segmentation and/or text recognition right, there are better APIs available than natural language? It seems like a lot of effort to build a production-scale CV application only to weigh it down with all of an LLM's shortcomings. This isn't a field I'm familiar with, but I would assume that this doesn't produce state-of-the-art results; if it did, that would change the analysis.

2 comments

As a hobby photographer, I organise everything for speedy retrieval but this would be amazing to search my collection.

Imagine you build an image segmentation model for, e.g., a specific industrial application.

With this LLM approach you can at least create your training data from the raw images with natural language.

That does make sense