

by famouswaffles 1207 days ago

1. No, it's not lol. If the model had been trained on only that much data, it wouldn't be anywhere near as good in French. 1.8% is only enough here because it was trained on other languages as well.

2. GPT-3 is also fluent in languages with less training data.

3. LLMs trained on code score noticeably higher on reasoning benchmarks.

1 comment

Donckele 1206 days ago

lol?

1.8% does look like a small number, but imagine (I know it's hard in this day and age of 4TB fingernail-sized USB sticks) a physical library holding good old-fashioned paper artifacts. What does 1.8% of that look like?
