| HN Mirror



	by a-t-c-g 83 days ago
	The quality of custom models trained with proper reasoning datasets [0], even at small parameter counts (3-7B is the sweet spot), is incredible now.
	[0]: cartesien.io or Salesforce's WebscaleRL

1 comments

What are you basing how good they are on? Personal experience or some benchmarks?

Benchmarks. We have internal ones comparing reasoning fine-tuned models vs. frontier models + prompts.

For some use cases the result ranges from parity with frontier performance at 1/20th the cost to exceeding it at 1/10th the cost. The trade-off is, ofc, narrow applicability.
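
To make that comparison concrete, here is a rough sketch of how a cost-normalized comparison could be computed. Everything in it is an assumption for illustration: the `Result` type, the `compare` helper, and the accuracy and cost numbers are hypothetical placeholders, not our internal benchmark data. Only the 1/20 and 1/10 cost ratios come from the claim above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    accuracy: float     # fraction of benchmark items answered correctly
    cost_per_1k: float  # cost to run 1,000 benchmark items (any currency)


def compare(candidate: Result, baseline: Result) -> dict:
    """Compare a candidate model against a baseline on quality and cost."""
    return {
        # Positive means the candidate scores higher than the baseline.
        "accuracy_delta": candidate.accuracy - baseline.accuracy,
        # Below 1.0 means the candidate is cheaper to run.
        "cost_ratio": candidate.cost_per_1k / baseline.cost_per_1k,
        # How much more accuracy each unit of spend buys with the candidate.
        "accuracy_per_cost_gain": (candidate.accuracy / candidate.cost_per_1k)
        / (baseline.accuracy / baseline.cost_per_1k),
    }


# Hypothetical placeholder numbers, chosen only to match the stated cost ratios.
frontier = Result("frontier + prompts", accuracy=0.80, cost_per_1k=20.0)
parity = Result("fine-tuned, parity case", accuracy=0.80, cost_per_1k=1.0)
exceeds = Result("fine-tuned, exceeds case", accuracy=0.85, cost_per_1k=2.0)

for model in (parity, exceeds):
    print(model.name, compare(model, frontier))
```

Note this only says something about the narrow task the benchmark covers, which is the trade-off mentioned above.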

How can I learn more about these models? Are they open source?

There are plenty of OSS fine-tuned models and base models around. If you want to do this on your own dataset, it's worth getting in touch with cartesien.io or wiring up https://github.com/SalesforceAIResearch/PretrainRL-pipeline

Thank you.