>the time it takes it to learn what works/doesn't work widens.
From the raw scaling laws we already knew that a new base model may peter out in this run or the next with some amount of uncertainty--"the intersection point is sensitive to the precise power-law parameters":