Hacker News
by wokwokwok 1136 days ago
> The smallest/fastest model that is accurate enough for your use case is ideal.

Sure.

…but it’s also fair to say that how small a model you can get away with for your use case is bounded by parameter count.

No amount of training data can make a 100-param model do text summarisation.

If you have a 3B param model, and you want a ChatGPT-style assistant to embed in your app, do you think it’ll do?

I don’t.

The output is not at that quality level, because the model is too small.

Not everyone needs that; but these 3B / 7B models don’t have the capability to do everything.

1 comment

With 100 params you could still predict which letter is most likely to come next.
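
A rough sketch of what that might look like, as a toy of my own rather than anything from the thread: store, for each character, the single follower seen most often after it. Assuming an alphabet of lowercase letters plus space, that is at most 27 stored values, well under 100. The names `fit_next_letter_table` and `predict_next` and the sample corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def fit_next_letter_table(text):
    """Map each character to the follower seen most often after it."""
    followers = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        followers[prev][nxt] += 1
    # One stored value per distinct character: at most 27 "parameters"
    # when the text contains only lowercase letters and spaces.
    return {prev: counts.most_common(1)[0][0]
            for prev, counts in followers.items()}


def predict_next(table, prev, default=" "):
    """Return the stored guess, or a default for unseen characters."""
    return table.get(prev, default)


# Hypothetical toy corpus; a real one would be far larger.
corpus = "the quick brown fox jumps over the lazy dog and then the theory holds"
table = fit_next_letter_table(corpus)

print(len(table))               # number of stored values
print(predict_next(table, "t")) # guess for the letter after "t"
print(predict_next(table, "q")) # guess for the letter after "q"
```

That is about the ceiling for a model this size: it can guess "h" after "t", but nothing in a table of 27 characters can compress a paragraph into a sentence.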