| HN Mirror

While “stochastic parrot” is obviously an over-simplification, and the way the phrase was originally coined in context was likewise obviously intended to be dismissive, I think the analogy holds. I see them as a lossy storage system.

I think the expectation that simply continuing to scale the transformer architecture is not likely to exhibit the type of “intelligence” for which _researchers_ are looking.

For my personal taste, the most interesting development of NLP in this latest AI wave (and LLMs in general) is RAG. I also have always wondered why the tokenization process hasn’t been deemed more important historically. To me, it seems like THE MOST critical part of how Deep Learning works.