by cs702
244 days ago
Briefly, an RLM wraps an existing language model (LM) together with an environment that can dynamically manipulate the prompt that will be fed into the LM. The authors use a Python REPL as the environment, and the REPL itself can call other instances of the LM. The prompt is held as a Python variable on the REPL and manipulated programmatically. The motivation is for the LM to use Python commands, including commands that call other LM instances, to figure out how best to modify the context at inference time. The results from early testing look impressive at first glance: an RLM wrapping GPT-5-mini outperforms GPT-5 by a wide margin on long-context tasks, at significantly lower cost. I've added this to my reading list.
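
To make the idea concrete, here is a rough sketch of the loop as I understand it from the description above. It is not the authors' code. Everything named here (`rlm_answer`, `llm_query`, `answer`, and the two `fake_*` stand-ins for real models) is hypothetical, and the "root LM" just returns hard-coded Python so the example runs without any API.

```python
import re
from typing import Callable, Dict

# Hypothetical LM interface: prompt text in, generated text out.
LM = Callable[[str], str]


def rlm_answer(root_lm: LM, sub_lm: LM, query: str, context: str,
               max_steps: int = 5) -> str:
    """Toy RLM loop. The root LM never reads `context` directly; it writes
    Python that runs in a REPL-like namespace where `context` is a variable."""
    env: Dict[str, object] = {
        "context": context,    # the long prompt, held as a Python variable
        "llm_query": sub_lm,   # lets generated code call another LM instance
        "answer": None,        # generated code sets this to finish
    }
    transcript = f"Query: {query}\nlen(context) = {len(context)}\n"
    for _ in range(max_steps):
        code = root_lm(transcript)
        exec(code, env)  # assumption: a real system would sandbox this
        if env["answer"] is not None:
            return str(env["answer"])
        transcript += f"\n# previously ran:\n{code}\n"
    raise RuntimeError("no answer within max_steps")


def fake_root_lm(transcript: str) -> str:
    # Stand-in for the root model: split the context into 100-line chunks,
    # ask a sub-LM about each chunk, and keep the first non-empty reply.
    return (
        "lines = context.splitlines()\n"
        "chunks = ['\\n'.join(lines[i:i + 100])"
        " for i in range(0, len(lines), 100)]\n"
        "hits = [llm_query(c) for c in chunks]\n"
        "answer = next((h for h in hits if h), None)\n"
    )


def fake_sub_lm(chunk: str) -> str:
    # Stand-in for a sub-model that extracts the number from one chunk.
    match = re.search(r"magic number is (\d+)", chunk)
    return match.group(1) if match else ""


# Needle-in-a-haystack demo: 1000 lines, one of which holds the answer.
haystack = ["nothing to see here"] * 1000
haystack[637] = "The magic number is 7421."
result = rlm_answer(fake_root_lm, fake_sub_lm,
                    "What is the magic number?", "\n".join(haystack))
print(result)  # prints 7421
```

The point of the sketch is only the shape: the root model sees a short transcript, the full context lives in a variable, and the model-written code decides how to slice it and when to hand pieces to sub-calls.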