by mettamage 520 days ago
How come models can be so small now? I don't know a lot about AI, but is there an ELI5 for a software engineer that knows a bit about AI?

For context: I've made some simple neural nets with backprop. I read [1].

[1] http://neuralnetworksanddeeplearning.com/

1 comment

You can find the phi-4 technical report [here](https://www.microsoft.com/en-us/research/uploads/prod/2024/1...)

The short version: they curate a smaller, high-quality synthetic dataset built from textbooks, problem sets, etc., instead of dumping in a massive dataset with tons of information.
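To make the "curate instead of dump" idea concrete for a software engineer, here is a toy sketch in Python. It is not the phi-4 pipeline (the report describes generating synthetic data, which this does not do); it only illustrates filtering a corpus down to a small, higher-quality subset. The `Doc` type, the source labels, the scoring rule, and the threshold are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    source: str  # hypothetical label, e.g. "textbook", "problem_set", "web_scrape"


# Hypothetical allow-list of sources we choose to treat as high quality.
TRUSTED_SOURCES = {"textbook", "problem_set"}


def quality_score(doc: Doc) -> int:
    """Toy heuristic, not a real quality metric: 2 points for a trusted
    source, 1 point for having at least 5 words."""
    score = 0
    if doc.source in TRUSTED_SOURCES:
        score += 2
    if len(doc.text.split()) >= 5:
        score += 1
    return score


def curate(docs: list[Doc], threshold: int = 3) -> list[Doc]:
    """Keep only documents whose score meets the threshold."""
    return [d for d in docs if quality_score(d) >= threshold]


corpus = [
    Doc("A derivative measures how a function changes as its input changes.", "textbook"),
    Doc("click here to win!!!", "web_scrape"),
    Doc("Solve for x: 2x + 3 = 11. Subtract 3, then divide by 2, so x = 4.", "problem_set"),
    Doc("lol", "textbook"),
]

curated = curate(corpus)
for doc in curated:
    print(doc.source, "->", doc.text)
```

Here the curated set keeps the textbook explanation and the worked problem, and drops the spam and the one-word entry. Real curation uses far more elaborate signals, but the trade-off is the same: fewer examples, each carrying more useful signal.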