The trick is not this neural alignment; it is training on many, many more tokens than Chinchilla recommends (roughly 20 tokens per parameter for compute-optimal training).
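
To put a number on "many, many more", here is a rough sketch. It assumes the commonly cited Chinchilla rule of thumb of about 20 training tokens per parameter; the function names and the example model size and token budget are hypothetical, picked only for illustration, and do not describe any particular model.

```python
# Rough sketch: how far past the Chinchilla-style budget a training run goes.
# ASSUMPTION: ~20 tokens per parameter is the widely quoted approximation of the
# Chinchilla compute-optimal ratio; the exact fitted value depends on the setup.

CHINCHILLA_TOKENS_PER_PARAM = 20  # assumed rule-of-thumb ratio


def chinchilla_optimal_tokens(n_params, tokens_per_param=CHINCHILLA_TOKENS_PER_PARAM):
    """Approximate compute-optimal token count for a model with n_params parameters."""
    return n_params * tokens_per_param


def overtraining_ratio(n_params, n_tokens, tokens_per_param=CHINCHILLA_TOKENS_PER_PARAM):
    """How many times the approximate Chinchilla budget a run of n_tokens consumes."""
    return n_tokens / chinchilla_optimal_tokens(n_params, tokens_per_param)


if __name__ == "__main__":
    # Hypothetical example: a 1B-parameter model trained on 2T tokens.
    n_params = 1_000_000_000
    n_tokens = 2_000_000_000_000
    print(chinchilla_optimal_tokens(n_params))     # 20000000000
    print(overtraining_ratio(n_params, n_tokens))  # 100.0
```

Under that assumption, the hypothetical run above uses 100 times the token budget the rule of thumb would suggest for a model of that size.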