Hacker News new | ask | show | jobs
by Deverauxi 850 days ago
The code isn’t very complicated. You could edit in any vision model you want:

https://github.com/microsoft/UFO/blob/main/ufo/llm/llm_call....

1 comments

I don't have the resources currently but would love to test something like LLaVA out for this, but I won't keep my hopes up.