| HN Mirror



	by parthsareen 559 days ago
	Hey! Author of the blog here. The current implementation uses llama.cpp's GBNF grammars, which allowed for a quick implementation. The biggest value-add at this time was getting the feature out. With newer work such as outlines and xgrammar coming out, I hope to update the sampling to support more formats, increase accuracy, and improve performance.
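
For readers unfamiliar with grammar-constrained sampling, here is a minimal sketch of the general idea: at each step, tokens the grammar does not allow are masked out before one is picked. This is a toy illustration, not how llama.cpp or the blog's implementation works internally. The vocabulary, the `ALLOWED` sequences standing in for a grammar, and the `fake_logits` function are all hypothetical.

```python
import math

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["yes", "no", "maybe", "{", "}", '"answer"', ":", '"yes"', '"no"', "<eos>"]

# Stand-in for a grammar: the only accepted outputs are
# { "answer" : "yes" } and { "answer" : "no" }.
ALLOWED = [
    ["{", '"answer"', ":", '"yes"', "}", "<eos>"],
    ["{", '"answer"', ":", '"no"', "}", "<eos>"],
]


def allowed_next(prefix):
    """Tokens that keep `prefix` a valid prefix of some accepted sequence."""
    n = len(prefix)
    return {seq[n] for seq in ALLOWED if n < len(seq) and seq[:n] == prefix}


def fake_logits(prefix):
    """Stand-in for a model: fixed scores that favor free-form text like 'yes'."""
    return {tok: -float(i) for i, tok in enumerate(VOCAB)}


def constrained_greedy_decode(max_steps=10):
    prefix = []
    for _ in range(max_steps):
        legal = allowed_next(prefix)
        if not legal:
            break
        logits = fake_logits(prefix)
        # Mask every token the grammar forbids at this position.
        masked = {t: (s if t in legal else -math.inf) for t, s in logits.items()}
        tok = max(masked, key=masked.get)  # greedy pick among legal tokens
        if tok == "<eos>":
            break
        prefix.append(tok)
    return prefix


print(" ".join(constrained_greedy_decode()))
```

Without the mask, the toy model would pick the bare token `yes` at every step; with it, the output is forced into the accepted structure. Checking each candidate token against the grammar at every step is where the cost comes from, which is why faster grammar engines matter.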

1 comments

mwieler 558 days ago

Hi, just wanted to say how much I appreciate your work.

I'm curious whether you have considered implementing Microsoft's Guidance (https://github.com/guidance-ai/guidance)? Their approach offers significant speed improvements, and speed is, as I understand it, sometimes a shortcoming of GBNF (e.g., https://github.com/ggerganov/llama.cpp/issues/4218).


parthsareen 557 days ago

Yes! I have checked Guidance out, as well as a few others. I'm planning to refactor sampling in the near future, which would include improving how grammars are used for sampling as well. Thanks for sharing!
