by brian_cloutier 1182 days ago

This is clever, but it seems to work through brute force, and my guess is that results should be accompanied by some kind of confidence measure. You're forcing the model to choose a path it wasn't particularly interested in going down.

In the ideal case, such as when you ask for a JSON array, the first "[" token you enforce will have a relatively high probability, and forcing the model down that path will give you good results.

In the dangerous case, the model doesn't have a good structure-compliant completion to your prompt, so the regex you supply forces it into a path of extremely low probability and you get trash results.
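A rough sketch of what such a confidence measure could look like, assuming you can read the model's unconstrained next-token probabilities at each step. The function name `path_confidence` and the toy distributions are made up for illustration; they are not taken from the library under discussion.

```python
import math

def path_confidence(step_probs, forced_tokens):
    """Score a forced path against the model's unconstrained preferences.

    step_probs: one dict per step, mapping token -> unconstrained probability.
    forced_tokens: the token the constraint actually emitted at each step.
    Returns (total log-probability of the forced path, lowest per-step probability).
    """
    log_prob = 0.0
    weakest = 1.0
    for probs, token in zip(step_probs, forced_tokens):
        p = probs.get(token, 0.0)
        if p <= 0.0:
            # The model gave this token no probability at all.
            return float("-inf"), 0.0
        log_prob += math.log(p)
        weakest = min(weakest, p)
    return log_prob, weakest

# Ideal case (hypothetical numbers): the model already wanted to open a JSON array.
ideal_steps = [{"[": 0.90, "Sure": 0.10}, {'"a"': 0.80, "]": 0.20}]
ideal_log_prob, ideal_weakest = path_confidence(ideal_steps, ["[", '"a"'])

# Dangerous case (hypothetical numbers): "[" was nearly impossible without the constraint.
dangerous_steps = [{"[": 0.001, "Sorry": 0.999}]
dangerous_log_prob, dangerous_weakest = path_confidence(dangerous_steps, ["["])
```

A caller could reject or flag any completion whose weakest step falls below some threshold; where to set that threshold is a judgment call the thread does not settle.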

2 comments

killthebuddha 1182 days ago

That sounds like a real risk but also the kind of thing you would need to implement a solution for anyways.

It seems like there are two clear paths:

- Allow the model to complete whatever it wants and then anneal the structure into compliance
- Force the model into a compliant structure and then anneal the quality

I think both options can make sense in different cases.

One case I'm thinking about where the second option feels simpler is when you want to implement a boolean function using a language model. I'm imagining a probability distribution that looks like:

- (40%) the answer is true
- (39%) True
- (21%) False

In this case it seems significantly more straightforward to force the model into completing T or F. I guess you then run into the "dangerous case" where you have

- (40%) the answer is false
- (39%) True
- (21%) False
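To make the two cases concrete, here is a small sketch that treats each listed completion as a single outcome (a simplification of real token-level decoding) and renormalizes over the allowed answers. The helper name `constrain` is hypothetical.

```python
def constrain(probs, allowed):
    """Drop every outcome outside `allowed` and renormalize what is left."""
    kept = {tok: p for tok, p in probs.items() if tok in allowed}
    mass = sum(kept.values())
    if mass == 0:
        raise ValueError("the constraint leaves no probability mass")
    return {tok: p / mass for tok, p in kept.items()}

allowed = {"True", "False"}

# The two distributions from the comment above.
benign = {"the answer is true": 0.40, "True": 0.39, "False": 0.21}
dangerous = {"the answer is false": 0.40, "True": 0.39, "False": 0.21}

benign_constrained = constrain(benign, allowed)
dangerous_constrained = constrain(dangerous, allowed)

# Both come out as roughly True 65% / False 35%.
print(benign_constrained)
print(dangerous_constrained)
```

The constrained output is identical in both cases, so the answer alone cannot tell you that in the second case the single most likely unconstrained completion said "false".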

rckrd 1182 days ago

(author here) That's interesting! Maybe there's a way to quantify the cumulative probability of the squashed tokens (i.e., if you constrain to 'true' and 'false', what's the distribution of the other tokens).
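One possible way to quantify that, sketched under the assumption that the full unconstrained distribution is available at the constrained step. `squashed_mass` is a hypothetical helper, not something the project is known to provide.

```python
def squashed_mass(probs, allowed):
    """Return the total probability of the masked-out outcomes and their
    own distribution, renormalized among themselves."""
    dropped = {tok: p for tok, p in probs.items() if tok not in allowed}
    mass = sum(dropped.values())
    if mass > 0:
        dist = {tok: p / mass for tok, p in dropped.items()}
    else:
        dist = {}
    return mass, dist

# Numbers from the "dangerous case" earlier in the thread.
probs = {"the answer is false": 0.40, "True": 0.39, "False": 0.21}
mass, dist = squashed_mass(probs, {"True", "False"})

# 40% of the probability was squashed, all of it on one outcome.
print(mass, dist)
```

A large squashed mass would be a signal that the constraint overrode what the model wanted to say, which is the case the parent comment is worried about.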

For now, this is a good way to make sure that I can parse the output reliably in the minimum number of completions (instead of looping until conformant).