by skepticATX 901 days ago
Exactly. I think that it’s very hard for us to comprehend just how much is out there on the internet.

The perfect example of that is the tikz unicorn in the Sparks paper. Seemed like a unique task, until someone found a tikz unicorn on an obscure website.

There is plenty of evidence that LLMs struggle as you move out of distribution. Which makes perfect sense as long as you stop trying to attribute what they’re doing to magic.

This doesn’t mean they’re not useful, of course. But it means that we should be skeptical about wild capability claims until we have better evidence than a tweet, as you put it.

1 comment

They didn't actually find a unicorn; they found other tikz animals. It still generalized to the unicorn.

This was the package: https://ctan.org/pkg/tikzlings?lang=en