by sota_pop 219 days ago
The point is that next-token prediction produces output by sampling from distributions assembled from text it has seen previously (hence “stochastic”). The “ding,” or claim, is that, like a parrot, LLMs can’t produce responses that are truly novel in concept or make logical out-of-sample leaps; they can only repeat words they’ve been explicitly taught in the past.
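To make "sampling from distributions assembled from text" concrete, here is a toy sketch. It is a word-level bigram counter, which is an assumption for illustration only and far simpler than a transformer; the function names (`train_bigram`, `sample_next`, `generate`) are made up for this example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, how often each following word appears."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def sample_next(counts, prev, rng):
    """Sample a next word in proportion to how often it followed `prev`."""
    options = counts.get(prev)
    if not options:
        return None  # `prev` was never followed by anything in training
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

def generate(counts, start, length, rng):
    """Generate up to `length` words, one sampled word at a time."""
    out = [start]
    for _ in range(length - 1):
        nxt = sample_next(counts, out[-1], rng)
        if nxt is None:
            break
        out.append(nxt)
    return out

if __name__ == "__main__":
    counts = train_bigram("the cat sat on the mat")
    print(" ".join(generate(counts, "the", 8, random.Random(0))))
```

This toy can only ever emit word pairs it has literally seen, which is the "parrot" behavior in its purest form. Whether that limit carries over to a neural model that learns its distributions instead of counting them is the part people argue about.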

So you think stochastic parrot is an accurate term and not an attempt to be dismissive? So if someone woke up from a coma and asked what ChatGPT is you would say "stochastic parrot" and think you've explained things?
While “stochastic parrot” is obviously an over-simplification, and the phrase as originally coined was likewise obviously intended to be dismissive, I think the analogy holds. I see LLMs as a lossy storage system.

I think that simply continuing to scale the transformer architecture is not likely to produce the type of “intelligence” for which _researchers_ are looking.

To my taste, the most interesting development in NLP in this latest AI wave (and in LLMs in general) is RAG. I have also always wondered why the tokenization process hasn’t been deemed more important historically. To me, it seems like THE MOST critical part of how Deep Learning works.