Hacker News
by darepublic 243 days ago
I don't quite understand how you get from this:

> I wanted to understand how these things work by building one myself.

Directly to this:

> What if training an LLM was as easy as npx create-next-app?

I mean that the second thought seems to be the opposite of the first (what if the entirety of training an LLM was abstracted behind a simple command).

1 comment

Great question - I should've been clearer.

When I started, I wanted to understand LLMs deeply. But I hit a wall: tutorials were either "hello world" toys or "here's 500 lines of setup before you start."

What I needed was: "give me working code quickly, THEN let me modify and learn."

That's what create-llm does. It scaffolds the boilerplate (like create-next-app), so you can spend time learning the interesting parts:

- Why does vocab size matter? (adjust config, see results)
- What causes overfitting? (train on small data, see it happen)
- How do different architectures perform? (swap templates, compare)
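To make the vocab size question concrete, here's a rough sketch of the arithmetic. This is not create-llm's code or config format: the function name `embedding_params` and the hidden size are made up for illustration. The only fact it relies on is that a token embedding table has vocab_size x d_model weights, and an untied output head adds the same amount again.

```python
def embedding_params(vocab_size: int, d_model: int, tied: bool = True) -> int:
    """Weights in the token embedding table, plus an untied output head if any.

    Hypothetical helper for illustration; not part of create-llm.
    """
    table = vocab_size * d_model
    # With tied weights the output projection reuses the embedding table.
    return table if tied else 2 * table


d_model = 256  # assumed hidden size for a small model

for vocab_size in (1_000, 8_000, 32_000, 50_257):
    n = embedding_params(vocab_size, d_model)
    print(f"vocab={vocab_size:>6}  embedding params={n:,}")
```

In a small model this term can be a large share of the total parameter count, which is one reason changing the vocab size in a config can visibly change training behavior.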

It's "easy to start, deep to master." The abstraction gets you running in 60 seconds, then you dig into the code.