| HN Mirror


by rsaha7 1018 days ago

I don’t think you fully understand the scope of this project. Your arguments are limited by your understanding of what is possible with these models.

This repository argues that LLMs can be used for applications beyond chat and Q&A. Our experimental findings (which you would have found if you had taken the time to go through the README under any model folder) show that LLMs do classification tasks really well in low-data situations. For the 99% of startups that don’t have the luxury of holding thousands of annotated samples like FAANG, LLMs provide a good way to get started with only a few annotated samples. At the end of the day, these models are built on the attention-based transformer architecture, so nothing ties them to chat.
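To make the low-data point concrete, here is a rough sketch of few-shot classification by prompting. This is not the repository's code: `generate` is a hypothetical callable standing in for whatever self-hosted model you run, and the label names and sample tickets are made up for illustration.

```python
from typing import Callable, Sequence, Tuple


def build_prompt(examples: Sequence[Tuple[str, str]],
                 labels: Sequence[str],
                 text: str) -> str:
    """Format a handful of annotated samples as a few-shot classification prompt."""
    lines = ["Classify the text into one of: " + ", ".join(labels) + ".", ""]
    for sample, label in examples:
        lines += [f"Text: {sample}", f"Label: {label}", ""]
    # Leave the final label blank so the model completes it.
    lines += [f"Text: {text}", "Label:"]
    return "\n".join(lines)


def parse_label(completion: str, labels: Sequence[str], default: str) -> str:
    """Map free-form model output back onto the closed label set."""
    stripped = completion.strip()
    first_line = stripped.splitlines()[0].lower() if stripped else ""
    for label in labels:
        if label.lower() in first_line:
            return label
    return default  # the model answered with something outside the label set


def classify(generate: Callable[[str], str],
             examples: Sequence[Tuple[str, str]],
             labels: Sequence[str],
             text: str) -> str:
    """Run one few-shot classification through a text-completion callable."""
    prompt = build_prompt(examples, labels, text)
    return parse_label(generate(prompt), labels, default=labels[0])


# Made-up data: two annotated samples instead of thousands.
LABELS = ["billing", "technical"]
EXAMPLES = [
    ("I was charged twice this month.", "billing"),
    ("The app crashes when I upload a file.", "technical"),
]


def fake_generate(prompt: str) -> str:
    """Stub standing in for a real model call (an assumption, not a real API)."""
    return " Billing\n"


predicted = classify(fake_generate, EXAMPLES, LABELS, "Why is my invoice higher?")
```

Swapping `fake_generate` for a call into a locally hosted model is the only change needed to try this on real data. Whether it beats a fine-tuned classifier is exactly what the READMEs measure.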

I would be curious to see some quantitative backing for your statements, not just links to Hugging Face’s website and conjecture.

And btw, the entire ecosystem is still trying to answer a lot of these questions because it is too early to predict anything. And here you are claiming they are absolutely nonsensical for 99% of companies.

Btw, did you know that a lot of companies cannot use third-party APIs because of sensitive customer data? For them, self-hosted models are a good alternative. And with the likes of Llama 2 and Falcon closing the performance gap, the idea of self-hosted models for tasks beyond chat does not seem far-fetched.