by abeppu 611 days ago
This is a bad and shallow article. Critically, n-gram models are literally just counting in the way described. If you can't account for the difference in behavior and performance between an LLM and an n-gram model that either has a similar parameter count or was trained on a similar number of tokens, then saying that LLMs are "just" counting votes is misleading.
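
To make "literally just counting" concrete, here is a minimal sketch of a bigram model (the n = 2 case). The function names and the toy corpus are my own, made up for illustration, and it deliberately leaves out smoothing and backoff, which real n-gram models usually add:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the raw counts for one context into relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        # Unseen context: a pure count model has nothing to say here.
        return {}
    return {tok: c / total for tok, c in followers.items()}

# Toy corpus, purely illustrative.
tokens = "the cat sat on the mat the cat ran".split()
model = train_bigram(tokens)
probs = next_token_probs(model, "the")
```

The whole model is a table of counts keyed by the exact preceding context. Any context it has not seen verbatim gets an empty answer, so it cannot generalize beyond what it has tallied.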