by EvgeniyZh 1233 days ago
Attention, scaling laws, diffusion, vision transformers, BERT/RoBERTa, CLIP, Chinchilla, ChatGPT-related papers, NeRF, Flamingo, RETRO/some retrieval SOTA

What do you mean by 'scaling laws'?
J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361, 2020.

and multiple follow-ups