Hacker News
by meame2010 662 days ago
Thanks for the insightful response. Good point on using 4o-mini to save cost. I'll try it out.

I will look more into soft-prompt tuning.

For the current scope, we are focused on in-context learning, i.e. ways to improve model reasoning at inference time.

We use an auto-differentiation framework (backpropagation) to do zero-shot instruction optimization and few-shot demonstration. Currently, even zero-shot alone can often surpass DSPy's few-shot results (with as many as 40 shots). I have also come up with a training paradigm that will (1) start zero-shot, (2) compare performance against a more advanced teacher model to see whether there is a gap we could close, and (3) if there is a gap to the teacher, start adding low-shot demonstrations and gradually increase the number of shots.
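To make the three-step schedule concrete, here is a rough Python sketch. It is not our actual implementation: `accuracy`, `escalate_shots`, `min_gap`, `max_shots`, and the `Model` call signature are all hypothetical names made up for illustration, exact-match scoring is an assumption, and the instruction-optimization step itself is left out. It only shows the "start zero-shot, check the gap to the teacher, then add shots" control flow.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (question, expected answer)
# Hypothetical signature: a model takes a question plus demonstrations
# and returns an answer string.
Model = Callable[[str, Sequence[Example]], str]


def accuracy(model: Model, evalset: Sequence[Example],
             demos: Sequence[Example]) -> float:
    """Exact-match accuracy of `model` on `evalset` when given `demos`."""
    if not evalset:
        return 0.0
    correct = sum(model(q, demos) == a for q, a in evalset)
    return correct / len(evalset)


def escalate_shots(student: Model, teacher: Model,
                   evalset: Sequence[Example],
                   demo_pool: Sequence[Example],
                   min_gap: float = 0.05,
                   max_shots: int = 8) -> Tuple[int, float]:
    """Return (number of shots used, final student accuracy)."""
    # (1) Start zero-shot: no demonstrations at all.
    shots = 0
    student_acc = accuracy(student, evalset, [])
    # (2) Score the teacher to see whether there is a gap to close.
    teacher_acc = accuracy(teacher, evalset, [])
    # (3) Add demonstrations one at a time while the gap remains
    #     and there are still demonstrations left to add.
    limit = min(max_shots, len(demo_pool))
    while teacher_acc - student_acc > min_gap and shots < limit:
        shots += 1
        student_acc = accuracy(student, evalset, demo_pool[:shots])
    return shots, student_acc
```

The point of this shape is that demonstrations are only paid for (in tokens and latency) when the zero-shot prompt is measurably behind the teacher; if the student already matches the teacher, the loop never runs and the prompt stays zero-shot.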