by jxnlco 945 days ago
Usually, models with more parameters do better with less training data. That's separate from few-shot learning, though the two are related in other ways.