by sandkoan
1072 days ago
The prompt is given to our model as guidance (a suggestion), while the CFG constrains the model to generate only tokens that conform to the schema (an enforcement). That's how we guarantee only valid outputs at text generation time. We also prefill some tokens, depending on the set of allowed tokens at a given state, so the model doesn't spend compute predicting them.
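
To make the idea concrete, here is a minimal sketch of grammar-constrained decoding with prefilling. It is not the project's actual code: `GRAMMAR`, `VOCAB`, `toy_model`, and `constrained_generate` are hypothetical names, the "grammar" is a small state table rather than a full context-free grammar, and the model is a stand-in that returns fixed scores. It also assumes one plausible reading of prefilling: when a state allows exactly one token, that token is emitted without calling the model.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical toy grammar: each state maps an allowed token to the next state.
# A real CFG would need a parser stack; a flat state table keeps the sketch short.
GRAMMAR: Dict[str, Dict[str, str]] = {
    "start": {"{": "key"},
    "key": {'"name"': "colon"},
    "colon": {":": "value"},
    "value": {'"alice"': "close", '"bob"': "close"},
    "close": {"}": "done"},
}

VOCAB: List[str] = ["{", "}", ":", '"name"', '"alice"', '"bob"', "hello"]


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for a language model: one score (logit) per vocabulary token.

    It deliberately prefers "hello", which the grammar never allows.
    """
    scores = {tok: 0.0 for tok in VOCAB}
    scores["hello"] = 5.0
    scores['"bob"'] = 2.0
    return scores


def constrained_generate(
    model: Callable[[List[str]], Dict[str, float]],
    grammar: Dict[str, Dict[str, str]],
    start: str = "start",
    end: str = "done",
) -> Tuple[List[str], int]:
    """Generate tokens that follow the grammar; return (tokens, model_calls)."""
    state = start
    output: List[str] = []
    model_calls = 0
    while state != end:
        allowed = grammar[state]
        if len(allowed) == 1:
            # Prefill: only one token is legal here, so skip the model entirely.
            token = next(iter(allowed))
        else:
            # Enforcement: score with the model, but choose only among allowed tokens.
            logits = model(output)
            model_calls += 1
            token = max(allowed, key=lambda t: logits.get(t, -math.inf))
        output.append(token)
        state = allowed[token]
    return output, model_calls


tokens, calls = constrained_generate(toy_model, GRAMMAR)
print("".join(tokens), calls)
```

Even though the stand-in model scores "hello" highest, it can never be emitted, because the choice is restricted to the tokens the current state allows. The model is consulted only at the one state with a real choice; every other token is filled in directly.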