| HN Mirror



	by darkteflon 1068 days ago
	Thanks, all good points that would seem to make this library a good fit for certain use-cases. Like the other poster, I’d be interested to hear a bit more about point 1.

2 comments

sandkoan 1068 days ago

https://news.ycombinator.com/item?id=36753254

Does this help clarify?


darkteflon 1067 days ago

Got it, thanks. Certainly a very interesting and active space. I was playing around with FLARE (https://arxiv.org/abs/2305.06983) for RAG this week, and LMQL (mentioned by another poster) seems to use a similar technique.


darkteflon 1067 days ago

In response to your sister comment: the implementation we used was the naive one from LangChain (https://python.langchain.com/docs/modules/chains/additional/...). We've decomposed that to use as a starting point but early results are promising, yes, although it doesn't yet seem to be possible to get the necessary `logprobs` out of the GPT-4 API, so we're stuck with 3.5-turbo atm.
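
For anyone wondering why the `logprobs` matter here: as I understand FLARE, it drafts a tentative next sentence and only triggers retrieval when the model looks unsure about some token in it. A minimal sketch of that check, assuming you already have per-token log-probabilities for the draft sentence. The function name `needs_retrieval`, the 0.4 threshold, and the sample values are all made up for illustration and are not taken from the paper or from LangChain:

```python
import math

def needs_retrieval(token_logprobs, threshold=0.4):
    """Return True if any token's probability falls below the threshold.

    `threshold` is an assumed value, not one prescribed by FLARE.
    """
    return any(math.exp(lp) < threshold for lp in token_logprobs)

# Hypothetical log-probabilities for two tentative sentences.
confident = [-0.05, -0.10, -0.02]   # every token has probability above 0.9
uncertain = [-0.05, -2.30, -0.02]   # middle token has probability of about 0.10

print(needs_retrieval(confident))
print(needs_retrieval(uncertain))
```

Without log-probabilities from the API there is no direct way to compute this signal, which is the limitation mentioned above.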


sandkoan 1067 days ago

Ahh, I've been meaning to try FLARE—was it a marked improvement over traditional RAG?


jacky2wong 1068 days ago

Point 1 doesn't feel like a good enough reason. The number of tokens output as JSON is so small if you tell GPT to output it properly.


sandkoan 1068 days ago

Costs add up surprisingly quickly. A quote-colon-space-quote combo alone is four tokens wasted. Now scale that up....


agroot12 1067 days ago

Using tiktokenizer, these are only two tokens: quote-colon is token 498, space-quote is token 330 (as per https://tiktokenizer.vercel.app/ ). But I agree with the general argument.

I think what matters even more when you use the API is that you do not have fine-grained control over the generation process. If you follow the MS guidance approach, you fill in the structured text yourself and let the model generate only the value parts, e.g. up to the next quote. To do that more or less word by word, you need multiple API calls and have to be very smart about providing the right stop tokens.
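
To make the "fill in the structure yourself" idea concrete, here is a rough sketch, not how guidance itself is implemented. `fill_template` and `fake_complete` are hypothetical names; `fake_complete` is a canned stand-in for a real completion call that accepts a stop sequence, so the example runs without any API. It also assumes the generated values contain no characters that need JSON escaping:

```python
import json

def fake_complete(prompt, stop):
    """Hypothetical stand-in for a completion API call with a stop sequence."""
    canned = {"name": 'Ada" , more text', "city": 'London"}'}
    # The prompt always ends with `"<key>": "`, so recover the key being filled.
    key = prompt.rsplit('"', 3)[-3]
    # Emulate the stop sequence: cut the raw text at the first closing quote.
    return canned[key].split(stop, 1)[0]

def fill_template(fields, complete):
    """Build a JSON object, asking the model only for each value."""
    prompt = "{"
    for i, key in enumerate(fields):
        if i:
            prompt += ", "
        prompt += json.dumps(key) + ': "'    # scaffolding written by us, not the model
        value = complete(prompt, stop='"')   # one call per value
        prompt += value + '"'
    return prompt + "}"

result = fill_template(["name", "city"], fake_complete)
print(json.loads(result))
```

Note that this makes one call per field, which is exactly the overhead described above: the quotes, colons, and braces cost no output tokens, but you pay in round trips and have to pick the stop sequences carefully.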
