Hacker News new | ask | show | jobs
by gabegobblegoldi 563 days ago
Markov’s paper also has links to Google papers from two different sets of authors that shows minimal advantage of pretraining. And given the small number of benchmarks using a pretrained model from Google whose provenance is not known would be counterproductive. Google likely trained it on all available benchmarks to regurgitate the best solutions of commercial tools.