Hacker News
by CapsAdmin 1136 days ago
> why don't we just train it on a CS curriculum instead of millions of examples of code written by humans?

I've never studied computer science formally, but I doubt students learn only from the CS curriculum. I don't even know how much knowledge a CS curriculum entails, but I don't see anything wrong with, for example, including example code written by humans.

Surely students also learn, collectively, from millions of code examples online alongside their studies. I'm sure teachers do the same.

A language model can also only learn from text, so what about all the implicit knowledge and verbal communication?

1 comment

What they are saying is that if you’ve studied computer science, you should be able to write a computer program without storing millions or billions of lines of code from GitHub in your brain.

A CS graduate could work out how to write software without doing that.

So they’re just pointing out the difference in “learning”.

LLMs are not storing millions or billions of lines of code, and neither are we. Both store something more general and abstract.

But I'm saying there's a big difference between a CS graduate and some current LLM that learns from "the CS curriculum". A CS graduate can ask questions, use Google to learn about things outside of school, work on hobby projects, study existing code beyond what's shown at university, get compiler feedback when things go wrong, etc.

All a language model can do is read text and try to predict what comes next.
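
To make "read text and try to predict what comes next" concrete, here is a toy sketch. It is only an illustration under heavy assumptions: it counts which word follows which in a tiny made-up corpus, whereas a real LLM uses a neural network over tokens. The names `train_bigram` and `predict_next` are hypothetical, not from any library.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up training text, purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" only once
print(predict_next(model, "dog"))  # "dog" never appears in the corpus
```

Note that the sketch has no way to ask a question, run a compiler, or look anything up. Everything it "knows" comes from the text it was given, which is the limitation described above.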