| HN Mirror



	by woodson 793 days ago
	I had very bad results with small models for translation in particular. None of the tricks I tried helped (prompting in the target language apparently helps for some). Encoder-decoder models such as FLAN-T5 or MADLAD-400 seemed far superior at equal or even smaller model size.
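For anyone who wants to try the encoder-decoder route, here is a rough sketch using Hugging Face `transformers`. It is an illustration, not the setup the comment above was based on. The helper names (`flan_t5_prompt`, `madlad_prompt`, `translate`) are made up for this example, and the checkpoint `google/flan-t5-small` and the prompt formats are assumptions taken from the public model cards as I remember them, so check them against the cards before relying on them.

```python
def flan_t5_prompt(text: str, source: str = "English", target: str = "German") -> str:
    # Assumption: FLAN-T5 accepts a plain natural-language instruction prefix.
    return f"translate {source} to {target}: {text}"


def madlad_prompt(text: str, target_code: str = "de") -> str:
    # Assumption: MADLAD-400 MT checkpoints expect a "<2xx>" target-language tag.
    return f"<2{target_code}> {text}"


def translate(prompt: str, model_name: str = "google/flan-t5-small",
              max_new_tokens: int = 64) -> str:
    # Imported lazily so the prompt helpers work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Encode the prompt, generate with the seq2seq model, and decode the result.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the checkpoint on first run; output quality will vary by model size.
    print(translate(flan_t5_prompt("How old are you?")))
```

To try MADLAD-400 instead, pass a MADLAD checkpoint name as `model_name` and build the input with `madlad_prompt`; whether the `Auto*` classes load that checkpoint cleanly is something to verify locally.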

1 comment

I forget which model (LLaMA 3?), but I heard that 95% of its training data was English.