Hacker News
by Gormo 213 days ago
But the clocks in this demo aren't images.

Yes, but they are reasoning within their training data, which will contain multiple examples of HTML+CSS clocks.

They struggle to produce good results because they are language models, and language models don't have great spatial reasoning skills.

Their output normally has all the elements, just not in the right place, shape, or orientation.