Hey! Author of the blog here. The current implementation uses llama.cpp's GBNF grammars, which made for a quick implementation. The biggest value-add at the time was getting the feature out.
With newer research and tooling like outlines and xgrammar coming out, I hope to update the sampling to support more formats, increase accuracy, and improve performance.
Yes! I have checked out guidance, as well as a few others. I'm planning to refactor sampling in the near future, which would include improving how grammars are used for sampling as well. Thanks for sharing!