Hacker News
by Benjammer 188 days ago
So the idea is what? What does a successful outcome look like for this test, in your mind? What should good software do? Respond and say there are 5 legs? Or question what kind of dog this even is? Or get confused by a nonsensical picture that doesn't quite match the prompt? Should it understand the concept of a dog and be able to tell you that this isn't a real dog?

You know, I had a potential hire last week. I was interviewing this one guy whose resume was really strong, exceptional in many ways, plus his open-source code looked really tight. But at the beginning of the interview, I always show candidates the same silly code example with signed integer overflow undefined behavior baked in. I did the same here and asked him if he saw anything unusual in it, and he failed to detect it. We closed the round immediately and I gave a no-hire decision.
Does the ability to verbally detect gotchas in short conversations, dealing only with text on a screen or whiteboard, really map to stronger candidates?

In actual situations you have documentation, an editor, tooling, and tests, and you are a tad less distracted than in a job interview with all the attendant stress. Isn't the fact that he actually produces quality code in real life a stronger signal of quality?

It's bias and, in my experience, many people do not know how to assess an interviewee in a way that brings out their best. Luckily, my example was just a made-up one, meant to sarcastically portray how people assess LLM capabilities nowadays too. No difference.
No, it’s just a test case to demonstrate flexibility when faced with unusual circumstances.