| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ai_what 745 days ago
	It won't. Amazon kind of when that angle with MistralLite[1] (a 7B finetune), and it was barely passing in terms of being an effective summarizer. 0.5B are pretty much useless. https://huggingface.co/amazon/MistralLite

2 comments

coder543 745 days ago

The official Mistral-7B-v0.2 model added support for 32k context, and I think it's far better than MistralLite. Third-party finetunes are rarely amazing at the best of times.

Now, we have Mistral-7B-v0.3, which is supposedly an even better model:

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

link

z4y5f3 744 days ago

My experience is that < 500M models are pretty useful when fine-tuned on traditional NLP tasks, such as text classification and sentence/token level labeling. A modern LM with a 32K context window size could be a nice replacement for BERT, RoBERTa, BART.

link