by lblume 455 days ago
Small LLM weights are not really interesting, though. I am currently training GPT-2-small-sized models for a scientific project, and their world models are just not good enough to generate any kind of real insight about the world they were trained on, apart from corpus biases.
2 comments

Small large language models? This sounds like the apocryphal headline when a spiritualist with dwarfism escaped prison: "Small medium at large." Do you also have some dehydrated water and a secure key escrow system?
A collection of newspapers is generally a better source than a single leaflet, but even a leaflet is a piece of history.