by xg15 497 days ago
Just read the abstract so far. Sounds amazing, but just for the sake of understanding, what would be the inputs and outputs of such a system? If the prompt is generated, how do you tell the system what you'd like to have? And what is the ground truth that is trained against? Examples of the desired text?

I think what they mean is intermediate prompts, i.e. the prompts that the system gives to itself when solving a problem that requires multiple stages.
Ah, that makes sense. So (very) basically, they're putting a number of regular LLMs into a sort of compute chain/graph, where one LLM feeds into the next, then doing gradient descent on the whole chain at once, essentially treating the boundaries between LLM n and LLM n+1 as "hidden layers"?
Author here. Yeah, roughly like that. It can also generate the feedback by using an LLM as the backward engine.
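
For anyone who wants the idea above in concrete form, here is a rough Python sketch of a chain of prompted stages with a text-based "backward pass". This is only an illustration of the description in this thread, not the paper's actual implementation. All names (`Stage`, `forward`, `backward`, `stub_llm`, `stub_backward_llm`) are hypothetical, the stubs stand in for real model calls, and the "update" step (appending the feedback to the prompt) is an assumed placeholder for whatever optimizer the real system uses.

```python
from dataclasses import dataclass
from typing import Callable, List

# Any function mapping a text prompt to a text reply (assumption: a real
# system would call an actual model here).
LLM = Callable[[str], str]


@dataclass
class Stage:
    prompt: str  # the "trainable parameter" of this stage


def forward(stages: List[Stage], llm: LLM, x: str) -> List[str]:
    """Run x through every stage, keeping the intermediate texts
    (the "hidden layers" between LLM n and LLM n+1)."""
    activations = [x]
    for stage in stages:
        activations.append(llm(f"{stage.prompt}\n\nInput: {activations[-1]}"))
    return activations


def backward(stages: List[Stage], backward_llm: LLM,
             activations: List[str], critique: str) -> List[Stage]:
    """Walk the chain in reverse, asking an LLM for textual feedback on each
    stage's prompt, and return new stages with revised prompts."""
    feedback = critique
    revised = []
    for i in reversed(range(len(stages))):
        request = (
            f"Stage prompt: {stages[i].prompt}\n"
            f"Stage input: {activations[i]}\n"
            f"Stage output: {activations[i + 1]}\n"
            f"Feedback on the output: {feedback}\n"
            "How should the stage prompt change?"
        )
        # The reply doubles as the feedback passed to the previous stage.
        feedback = backward_llm(request)
        # Placeholder update rule (assumption): append the feedback as a note.
        revised.append(Stage(f"{stages[i].prompt}\n(Revision note: {feedback})"))
    return list(reversed(revised))


# Deterministic stand-ins so the sketch runs without any model access.
def stub_llm(text: str) -> str:
    return f"<reply to {len(text)} chars>"


def stub_backward_llm(text: str) -> str:
    return "be more specific"


stages = [Stage("Summarize the input."),
          Stage("Translate the summary to French.")]
acts = forward(stages, stub_llm, "some document")
updated = backward(stages, stub_backward_llm, acts, "output was too vague")
```

The user-facing input is still the task and its examples; only the per-stage prompts are treated as adjustable, and the critique at the end of the chain plays the role of the loss.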