by lixy 940 days ago
It seems plausible that you could have the LLM side draw on its knowledge of known problems and their answers to quiz the Q-learning side.

While this would still rely on the knowledge base in the LLM, I would imagine it could reduce the effort required to train reinforcement learning models while widening the range of domains they could be applied to.
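
To make the idea a bit more concrete, here is a rough toy sketch, not a real LLM integration. Everything in it is an assumption made for illustration: `QUIZ_BANK` is a hypothetical stand-in for the problems and answers an LLM might supply, and the learner is a plain tabular Q-learner with one-step episodes, so the update target is just the reward for the answer it picked.

```python
import random

# Hypothetical stand-in for the LLM side: a fixed bank of problems with known answers.
QUIZ_BANK = {"2+2": "4", "3*3": "9", "10-7": "3"}
# Candidate answers the learner can choose from (its action space).
ACTIONS = sorted(set(QUIZ_BANK.values()))


def train(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Train a tabular Q-learner by quizzing it against the known answers."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in QUIZ_BANK for a in ACTIONS}
    questions = sorted(QUIZ_BANK)
    for _ in range(episodes):
        state = rng.choice(questions)  # the quizzer poses a question
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: guess a random answer
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        # The quizzer grades the guess against the known answer.
        reward = 1.0 if action == QUIZ_BANK[state] else 0.0
        # One-step episode: there is no next state, so the target is the reward.
        q[(state, action)] += alpha * (reward - q[(state, action)])
    return q


def greedy_answer(q, question):
    """Return the answer with the highest learned value for a question."""
    return max(ACTIONS, key=lambda a: q[(question, a)])


if __name__ == "__main__":
    q_table = train()
    for question in QUIZ_BANK:
        print(question, "->", greedy_answer(q_table, question))
```

In a real setup the quizzer would generate and grade questions on the fly rather than read from a fixed table, and that is where the dependence on the LLM's knowledge base comes in.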