| HN Mirror

by Arctic_fly 1009 days ago

> Llama 7B wasn't up to the task fyi, producing very poor translations.

From what I've read and personally experimented with, none of the Llama 2 models are well-suited to translation in particular (they were mainly trained on English data). Still, there are a number of tasks that they're really good at if fine-tuned correctly, such as classification and data extraction.

> I believe that OpenAI priced GPT-3.5 aggressively cheap in order to make it a no-brainer to rely on them rather than relying on other vendors (even open source models).

I think you're definitely right about that, and in most cases just using GPT-3.5 for one-off tasks makes the most sense. Once you get into production workflows that scale, small fine-tuned models start making more sense. You can drop the system prompt, get data in the format you expect, and train on GPT-4's output to sometimes get better accuracy than 3.5 would give you right off the bat. And keep in mind that while you can do the same thing with a fine-tuned 3.5 model, it's going to cost 8x the base 3.5 price per token.
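
A rough back-of-the-envelope sketch of that trade-off, in Python. Only the 8x multiplier comes from the comment above. The base price, the token counts, and the assumption of one flat per-token price for both prompt and completion are illustrative placeholders, not real pricing.

```python
# Hypothetical numbers for illustration only; check current pricing before relying on them.
BASE_PRICE_PER_1K = 0.002      # assumed $/1K tokens for the base model
FINE_TUNED_MULTIPLIER = 8      # the "8x the base price" figure from the comment

SYSTEM_PROMPT_TOKENS = 700     # assumed size of the instructions a fine-tune lets you drop
INPUT_TOKENS = 200             # assumed per-request payload
OUTPUT_TOKENS = 100            # assumed per-request completion


def request_cost(prompt_tokens: int, completion_tokens: int, price_per_1k: float) -> float:
    """Cost of one request, assuming a single flat price per 1K tokens."""
    return (prompt_tokens + completion_tokens) / 1000 * price_per_1k


def break_even_system_prompt_tokens(input_tokens: int, output_tokens: int, multiplier: int) -> int:
    """System-prompt size at which base + prompt costs the same as fine-tuned without it.

    Solves (S + input + output) * p == (input + output) * multiplier * p for S.
    """
    return (multiplier - 1) * (input_tokens + output_tokens)


# Base model: pays for the system prompt on every call.
base_cost = request_cost(SYSTEM_PROMPT_TOKENS + INPUT_TOKENS, OUTPUT_TOKENS, BASE_PRICE_PER_1K)

# Fine-tuned hosted model: no system prompt, but every token costs 8x.
tuned_cost = request_cost(
    INPUT_TOKENS, OUTPUT_TOKENS, BASE_PRICE_PER_1K * FINE_TUNED_MULTIPLIER
)

break_even = break_even_system_prompt_tokens(INPUT_TOKENS, OUTPUT_TOKENS, FINE_TUNED_MULTIPLIER)

print(f"base model with system prompt: ${base_cost:.4f} per request")
print(f"fine-tuned model, no prompt:   ${tuned_cost:.4f} per request")
print(f"break-even system prompt size: {break_even} tokens")
```

With these placeholder numbers the system prompt would need to reach 2100 tokens before dropping it pays for the 8x multiplier. That is consistent with the point above: a hosted fine-tune of 3.5 rarely wins on price alone, which is where a small self-hosted fine-tuned model can come in.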

1 comments

kelseyfrog 1009 days ago

Is that because translation is typically an encoder-decoder task and Llama is decoder-only, or is there something else about it that makes the task difficult for Llama?

FeepingCreature 1009 days ago

If you don't make it learn other-language texts, it won't be able to speak that language.

mikewang 1008 days ago

From what I've learned, 85% of its training data is English. Other languages make up the remaining 15%.
