Hacker News
by embedding-shape 24 days ago
But are you seriously under the belief that all of that, plus all the other things you're forgetting about, is easier, cheaper and faster than transcriptions and translations?

I understand and agree that building the LLMs yourself comes with more benefits, especially long-term ones, but it's still harder, more expensive and really time-consuming work.

1 comment

I do not know which is easier. I am not sure research has even established, for generative text tasks, whether a translation-first or a native-language-first approach is more sample-efficient.

But for a national lab I think it is money well spent to figure out the possibilities and limitations of native-language LLMs for languages with on the order of 5M-10M speakers.