Hacker News
by intalentive 480 days ago
What's the value-add here? The schemas?
2 comments

We've seen so many different schemas and ways of prompting the VLMs. We're just standardizing it here, and making it dead-simple to try it out across model providers.
Basically, there is no ready-made model-schema combination. If you prompt an open-source model with the schema, it doesn't produce results in the expected format. The main contribution is making these models conform to your specific needs and return output in a structured format.
Wait, but we're doing that already, and it works well (Qwen 2.5 VL)? If need be, you can always resort to structured generation to enforce schema conformity?
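
To make the schema-conformity point concrete, here is a rough sketch of the simplest fallback: validate the model's output against a schema and retry on failure. This is validate-and-retry, not true structured generation (which constrains decoding itself). `SCHEMA`, `parse_if_conforming`, `generate_structured`, and `call_model` are all hypothetical names for illustration, and `call_model` is assumed to take a prompt string and return a string.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
SCHEMA = {"label": str, "confidence": float}


def parse_if_conforming(raw, schema):
    """Return the parsed dict if raw is JSON matching schema, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Require a JSON object with exactly the schema's keys.
    if not isinstance(data, dict) or set(data) != set(schema):
        return None
    # Require every value to have the declared type.
    if not all(isinstance(data[key], typ) for key, typ in schema.items()):
        return None
    return data


def generate_structured(call_model, prompt, schema, max_tries=3):
    """Call the model until its output conforms, up to max_tries attempts."""
    for _ in range(max_tries):
        result = parse_if_conforming(call_model(prompt), schema)
        if result is not None:
            return result
    raise ValueError("model never produced schema-conforming output")
```

Something like `generate_structured(my_vlm_call, prompt, SCHEMA)` would then either return a dict with exactly the expected fields or raise. Constrained decoding avoids the retries entirely, at the cost of needing access to the sampler.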