| HN Mirror



	by sinandrei 135 days ago
	Has anyone experimented with using a VLM to detect "marks"? I'm thinking of pen- or pencil-based markings such as underlines, circles, and checkmarks. Can these models do it?

1 comment

leetharris 135 days ago

In our experience, none of them do it well. We had to write our own custom pipeline, using a mixture of legacy CV approaches, to handle this (AI contract analysis). We benchmark every new multimodal model and VLM that comes out and are consistently disappointed.


coder543 135 days ago

If someone releases a benchmark or dataset, I'm sure that would significantly increase the chances of one of these AI labs training on the task.
