Hacker News
by ericflo 1123 days ago
I would also check out their 3B model. I tested it at launch with LoRA fine-tuning and found it surprisingly capable for its size. I think a lot of people are skipping it because it has only 3B parameters.
Edit: https://huggingface.co/togethercomputer/RedPajama-INCITE-Bas...