by stevage 503 days ago
Can anyone ELI5? Or at least a kind of layman's explanation?
3 comments

Consider a complex LLM pipeline with multiple steps. Each LLM call has an associated prompt that shapes the style and context of the response. Conventionally these prompts are treated like hyperparameters that have to be adjusted by hand to get the desired behavior from the LLM.

This work introduces a way to treat these prompts like trainable parameters, updating them through automatic differentiation of some kind of supervised training loss.
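To make that concrete, here is a toy sketch of such a loop. It is not the paper's implementation: `fake_llm`, `critique`, and `apply_feedback` are hypothetical stand-ins I made up so the example runs without a real model, and the "gradient" is just a string of feedback that gets folded back into the prompt.

```python
# Toy sketch: treat a prompt as a "parameter" updated from textual feedback.
# fake_llm, critique, and apply_feedback are hypothetical stand-ins, not a real API.

def fake_llm(prompt: str, question: str) -> str:
    """Stand-in for an LLM call: answers tersely only if the prompt asks for it."""
    if "concise" in prompt:
        return "4"
    return "Well, if you take two and add two, you eventually arrive at 4."

def critique(answer: str, target: str) -> str:
    """Stand-in for the 'loss': textual feedback, or '' when the answer matches."""
    if answer == target:
        return ""
    return "Be concise: reply with the final answer only."

def apply_feedback(prompt: str, feedback: str) -> str:
    """Stand-in for the 'update step': fold the feedback into the prompt."""
    return prompt + " " + feedback

def optimize_prompt(prompt, examples, max_steps=5):
    """Repeat forward pass -> feedback -> prompt update until no feedback remains."""
    for _ in range(max_steps):
        feedbacks = [critique(fake_llm(prompt, q), t) for q, t in examples]
        feedbacks = [f for f in feedbacks if f]
        if not feedbacks:
            break  # every example already matches its target
        prompt = apply_feedback(prompt, feedbacks[0])
    return prompt

examples = [("What is 2 + 2?", "4")]
tuned = optimize_prompt("You are a helpful assistant.", examples)
print(tuned)
```

In a real pipeline each of the three stand-ins would presumably be an LLM call itself, and the feedback would have to be passed backwards through every step of the pipeline rather than just one.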

To me it feels a bit like DeepDream or style transfer, which use autograd to optimize the model inputs (instead of the parameters) to achieve some goal (like mixing the style and content of two input images).
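A minimal sketch of the "optimize the inputs, not the parameters" idea, assuming a made-up linear model with frozen weights. The weights, learning rate, and target are arbitrary illustrative values, and the gradient is written out by hand here where an autograd library would normally compute it.

```python
# Toy sketch: gradient descent on the input x while the model weights stay frozen.
# The gradient is derived by hand; an autograd library would compute it for you.

WEIGHTS = [0.5, -1.0, 2.0]  # frozen model parameters (arbitrary illustrative values)

def model(x):
    """A fixed linear 'model': dot product of the frozen weights with the input."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x))

def loss(x, target):
    """Squared error between the model output and the desired output."""
    return (model(x) - target) ** 2

def grad_wrt_input(x, target):
    """d(loss)/dx_i = 2 * (model(x) - target) * w_i."""
    err = model(x) - target
    return [2.0 * err * w for w in WEIGHTS]

def optimize_input(x, target, lr=0.05, steps=200):
    """Update only the input; WEIGHTS is never modified."""
    for _ in range(steps):
        g = grad_wrt_input(x, target)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

x_opt = optimize_input([0.0, 0.0, 0.0], target=3.0)
print(model(x_opt), loss(x_opt, 3.0))
```

The prompt case is the same shape of problem, except the "input" being tuned is text, so the update can't be a plain numeric step like this one.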

This paper suggests that LLMs can be trained to handle multi-stage questioning by automatically optimizing prompts using feedback-based methods, improving their ability to process complex, multi-step interactions.
less words makes more prompt