| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by kacperlukawski 810 days ago
	Why is that an issue? Training the tokenizer seems much more straightforward than training the model as it is based on the statistics of the input data. I guess it may take a while for massive datasets, but is calculating the frequencies impossible to be done on a bigger scale?