Hacker News
by ShamelessC 1105 days ago
This is a trend I’ve noticed lately. An article attempting to make a sweeping generalization about the nature of LLMs/diffusion models deliberately cherry-picks only examples that support its argument. It will include ChatGPT, but using 3.5 Turbo instead of 4. Commenters then realize that most/all of the “evidence” works just fine in GPT-4.

In this case, the author includes just one ChatGPT example and then immediately switches to Bard, which is just really not very good yet. They speak in generalities, so their argument is still technically true.

Really frustrating. It’s clearly someone looking to confirm their pre-existing notions. In this case, they indeed seem to be “onto something”, but simply aren’t willing to do the rigorous work needed to prove their case.

Then a bunch of non-experts read it with no way of knowing all this (and why should they?), and now we have these LLM urban myths everywhere.

3 comments

There is a literature on arXiv where people evaluate a range of prompts. You really want N > 100, not the N = 1 that you see in blog posts.
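
To put a rough number on the N > 100 vs. N = 1 point, here is a minimal sketch that computes a 95% Wilson score interval for a model's success rate on a set of prompts. The function name `wilson_interval` and the pass/fail framing are my own assumptions for illustration, not something taken from any particular paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical scenario: the model fails every prompt it is shown.
    for n in (1, 100):
        low, high = wilson_interval(successes=0, n=n)
        print(f"N = {n:3d}: true success rate plausibly in [{low:.3f}, {high:.3f}]")
```

With a single failed prompt, the interval still allows a true success rate of up to about 0.79, so the one screenshot tells you almost nothing. With 100 failures out of 100, the upper bound drops below 0.04, which is the kind of evidence a general claim needs.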

It’s the default mode for humans: we believe something first and then add our reasoning to it. AKA confirmation bias and belief bias.

https://effectiviology.com/belief-bias/

I think this is so widespread! Investigating your biases is always worthwhile.

LLMs are useless! I was curious, so I ended up initializing one with 500 billion parameters. I trained for a whole 4 hours on a whopping 100 books. It still doesn't know anything! Awful. Sad. Clearly, they can't reason.

/s