by ericflo 1123 days ago
I would also check out their 3B model. I tested it at launch with LoRA fine-tuning and found it surprisingly capable for its size. I think a lot of people are skipping it because it only has 3B params.

Edit: https://huggingface.co/togethercomputer/RedPajama-INCITE-Bas...