by kmacdough 136 days ago

What are we testing here?

It feels like a very odd test, because using an LLM is such an unreasonable way to answer this. Nothing about the task requires more than a very localized understanding. It's not like a codebase or corporate documentation, where there's a lot of interconnectedness and context that's important. It also doesn't seem to poke at the gap between human and AI intelligence.

Why are people excited? What am I missing?