by tuchsen
1168 days ago
This looks really interesting. I recently did something similar at https://prlang.com, although my DSL is much simpler and mostly focused on chaining multiple prompts together and handling their results, like a declarative langchain. LMQL seems to allow fine-grained control of prompts as the results are generated. The constraint system is super interesting. I'm going to have to read the paper; I'm curious how they achieve token cost savings with a system that looks like it should increase costs.
LMQL's efficiency gains come from close supervision of the generation process: token masking via constraints is integrated directly into the decoding loop and happens at the token level. Compared to text-based high-level APIs, this saves a lot of useless continuations that the LM would otherwise produce and that you would have to discard further down the pipeline, for example because you want to enforce a constraint, insert a follow-up instruction, or insert a tool execution result.
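
To make the idea concrete, here is a rough toy sketch of constraint masking inside a greedy decoding loop. It is not LMQL's API or implementation: the vocabulary, `toy_logits`, `digits_only`, and `decode` are all hypothetical names made up for illustration, and the "model" is just a hard-coded score table.

```python
# Toy illustration only: all names here are hypothetical, not LMQL's API.
VOCAB = ["1", "2", "3", "yes", "no", "<eos>"]


def toy_logits(prefix):
    """Stand-in for a language model: fixed scores for the next token."""
    if not prefix:
        # With an empty prefix the toy model prefers the word "yes".
        return {"1": 1.0, "2": 0.5, "3": 0.2, "yes": 2.0, "no": 1.5, "<eos>": -1.0}
    # After the first token the toy model prefers to stop.
    return {"1": 0.1, "2": 0.3, "3": 0.2, "yes": 0.0, "no": 0.0, "<eos>": 2.0}


def digits_only(prefix, token):
    """Constraint: only digit tokens are allowed, and the output may not be empty."""
    return token.isdigit() or (token == "<eos>" and len(prefix) > 0)


def no_constraint(prefix, token):
    """Baseline: every token is allowed."""
    return True


def decode(constraint, max_steps=5):
    """Greedy decoding that masks disallowed tokens before each choice."""
    prefix = []
    for _ in range(max_steps):
        logits = toy_logits(prefix)
        # The mask is applied per token, inside the loop, so a violating
        # token is never generated (and never paid for) in the first place.
        allowed = {t: s for t, s in logits.items() if constraint(prefix, t)}
        if not allowed:
            break
        token = max(allowed, key=allowed.get)
        if token == "<eos>":
            break
        prefix.append(token)
    return prefix


masked = decode(digits_only)      # stays inside the constraint
unmasked = decode(no_constraint)  # would have to be checked and discarded afterwards
print(masked, unmasked)
```

In the unmasked case the output only gets validated after the fact, so a violating continuation is generated, thrown away, and retried. In the masked case that continuation never exists, which is where the savings would come from under these assumptions.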