by fxj 442 days ago
https://ollama.com/joefamous/QVQ-72B-Preview

Experimental research model with enhanced visual reasoning capabilities.

Supports context length of 128k.

Currently, the model only supports single-round dialogues and image inputs. It does not support video inputs.

Should be capable of handling images up to 12 MP.

1 comment

>Last December, we launched QVQ-72B-Preview as an exploratory model, but it had many issues.

That's an earlier version released some months ago. They even acknowledge it.

The version they present in the blog post, which you can run on their chat platform, is not open or available for download.