Hacker News new | ask | show | jobs
by aifooh7Keew6xoo 913 days ago
Most LLMs that are being studied popularly have not been trained with significant emphasis on arithmetic accuracy or mathematical reasoning, and those subjects represent a vanishing minority of their corpus and consequently maps poorly to the tokenization.

Essentially every obvious optimization here is currently bearing fruit simultaneously in smaller studies and incrementally larger models should continue to exhibit performance gains even without the particular focus on this area.