Hacker News
by vedant 3292 days ago
One reason is that the amount of training data is many, many orders of magnitude smaller.

FWIW, it seems the structure you're talking about exploiting is at the morphological and syntactic level, which modern language models tend to handle effectively. Semantics is a much harder problem.