by kuhewa 591 days ago
If the oatmeal hadn't sat in old milk for a week, yes, it likely would not cause GI distress

and yet if it had, it would.
Let's make one of your breakfast foods eggs. If they just had normal eggs, you often won't notice anything from the walk. However, when the eggs were also rotten and overindulged in, you can tell from the person's walk that they ate them for breakfast, and from the noxious sulfur smell emanating from the diarrhoea in the person's business casual slacks you can tell with a high degree of confidence that it wasn't the rotten-milk cornflakes but one of a very small number of sulfur-rich foods, probably eggs.

Bad human resume text and overuse of unmodified LLM output are both detectable, but they are detectable because they are bad in quite different ways.

Regarding the original resume reader's notion that they can detect LLM text with a high degree of accuracy, it is not their LLM output detection specificity I would take issue with (similarly, despite stating validation is critical, I would bet you, too, are pretty confident when you see an entire page of blogspam or marketing copy that you regard as LLM generated, despite it rarely being marked as such). Rather, it is their sensitivity, as I am sure occasional use, and especially slightly modified output from LLMs, gets by them now and again without them knowing.

yes, it's like men who think they can always spot makeup on a woman. or the economists who predicted 19 of the last 5 market crashes. no need for external validation.
The makeup is a great analogy. I can bet with 99.9% confidence that when I say a woman is wearing makeup, I'm correct (or it is tattooed on or similar). When it's obvious, it's obvious. However, quite often I fail to detect makeup on women who are wearing it.

The economists example is not as good an analogy. It is almost the converse of the makeup example, since it describes a high rate of false positives.

that's why I gave both examples - you have no evidence which camp you are in, without ground truth. if you add gut feeling to gut feeling, you don't get evidence.
One camp is a high specificity camp, the other is a high sensitivity camp. You definitely know what camp you are in if the only argument you are making is that you can detect true cases of LLM use without many false positives. I've already admitted less blatant LLM use goes undetected, so I am not arguing for high sensitivity.
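To put rough numbers on the two camps, here is a minimal Python sketch. Every count in it is made up purely for illustration (the `Confusion` class and both example objects are hypothetical, not measurements of anything). It also computes precision, which is probably the closest match to "when I say it is makeup, I am right".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing someone's calls against ground truth."""
    tp: int  # said "yes", truth was yes
    fp: int  # said "yes", truth was no
    fn: int  # said "no", truth was yes
    tn: int  # said "no", truth was no

    def sensitivity(self) -> float:
        # Share of true cases that get caught.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of non-cases that are correctly left alone.
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Share of "yes" calls that turn out to be right.
        return self.tp / (self.tp + self.fp)


# Hypothetical makeup spotter: rarely wrong when they call it, misses most cases.
makeup_spotter = Confusion(tp=30, fp=1, fn=70, tn=99)

# Hypothetical crash predictor: 19 calls, 5 real crashes, none missed.
crash_predictor = Confusion(tp=5, fp=14, fn=0, tn=81)

for name, c in [("makeup", makeup_spotter), ("crashes", crash_predictor)]:
    print(
        f"{name}: sensitivity={c.sensitivity():.2f} "
        f"specificity={c.specificity():.2f} precision={c.precision():.2f}"
    )
```

With these invented counts the makeup spotter is almost never wrong when making a call but misses most cases, while the crash predictor catches every crash at the cost of mostly wrong calls.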

And we have plenty of evidence of hallmarks of LLM use; we could even replicate the LLM resume generation process if we wanted. There is plenty of useful "training data" available even if you don't have a validated set of resumes submitted for this type of role at this type of company from this demographic of applicants.

Basically what you are trying to argue is that you can't have confidence that the animals you see people walking down the street on leashes are dogs unless you ask the owners whether they are dogs or not... AND that it doesn't matter that dogs are highly distinct from other domestic pets, AND that we've seen many verified dogs before in other contexts, AND have even bred different varieties of dogs on our own.

I highly doubt that you maintain that standard for inductive inference across the board in your own practice. Life would be very difficult if you refused to make inferences about novel things (with any confidence) based on generalised patterns derived from other, similar cases.