HN Mirror



	by kcorbitt 469 days ago
	To be honest, I don't expect the performance to generalize to other task types with this specific training regime. If we had a panel of like 30 logic puzzles and cross-trained against all of them simultaneously it might though. I think there's a lot of benefit to discovering a training regime that allows small specialized models to do extremely well in one narrow task; if we can figure out how to make small models that beat SOTA on a specific task and are cheap to train and run, that's in some ways a more useful outcome than a very large model that is good at many tasks (but is more expensive to run for each of them).

2 comments

shinryuu 468 days ago

The question to me is whether you can call that deduction in that case. Isn't it just a type of pattern matching that fits this particular task?


ekidd 469 days ago

Once the problem gets narrow enough, do you risk training a model that reinvents a straightforward classic algorithm at far higher cost?


bradhilton 469 days ago

Well, in this case there is a much more straightforward method: solving the puzzles with the same CP-SAT solver used to create them. This is more of a fun experiment to see if we can train LLMs to solve these kinds of logical deduction problems.
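
To make the "straightforward method" point concrete, here is a minimal sketch. The puzzle is a made-up toy, not one from the experiment, and plain exhaustive search stands in for CP-SAT so the example needs only the standard library. The names `PEOPLE`, `ROOMS`, `CLUES`, and `solve` are hypothetical.

```python
from itertools import permutations

# Toy puzzle (an assumption for illustration): three people, three rooms,
# exactly one person per room.
PEOPLE = ("Alice", "Bob", "Carol")
ROOMS = ("Kitchen", "Library", "Study")

# Each clue is a predicate over a candidate assignment {person: room}.
CLUES = [
    lambda a: a["Alice"] != "Kitchen",
    lambda a: a["Bob"] != "Library",
    lambda a: a["Bob"] != "Study",
    lambda a: a["Carol"] != "Study",
]


def solve(people=PEOPLE, rooms=ROOMS, clues=CLUES):
    """Return every one-person-per-room assignment that satisfies all clues."""
    solutions = []
    # Brute force: try each ordering of rooms against the fixed list of people.
    for ordering in permutations(rooms):
        assignment = dict(zip(people, ordering))
        if all(clue(assignment) for clue in clues):
            solutions.append(assignment)
    return solutions


if __name__ == "__main__":
    for solution in solve():
        print(solution)
```

A real constraint solver would prune the search instead of enumerating every ordering, but even this naive version is exact and essentially free to run at this size, which is probably why the narrow-task setup invites the comparison with a classic algorithm.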
