| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by sgk284 520 days ago

We have, and it works great! We currently do this in production, though we use it to help us optimize for consistency between task executions (vs the linked post, which is about improving the capabilities of a model).

Phrased differently, when a task has many valid and correct conclusions, this technique allows the LLM to see "How did I do similar tasks before?" and it'll tend to solve new tasks by making similar decisions it made for previous similar tasks.

Two things to note:

    - You'll typically still want to have some small epsilon where you choose to run the task without few-shots. This will help prevent mistakes from propagating forward indefinitely.

    - You can have humans correct historical examples, and use their feedback to improve the large model dynamically in real-time. This is basically FSKD where the human is the "large model" and the large foundation model is the "small model".