Hacker News new | ask | show | jobs
by andai 457 days ago
Yesterday GPT asked me if I'd like to train a small LLM and I laughed out loud.

That being said I'm amazed how far 1B models have come. I remember when TinyLlama came out a few years ago, it was not great. ($40K training cost iirc.)

That was a 1B model, but these days even 0.5B models are remarkably coherent.