by fzysingularity 491 days ago
You can always distill VLMs into much smaller / faster models that are specific to your domain or use-case.
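
For anyone unfamiliar with distillation, here is a rough sketch of one common recipe: train the small model to match the big model's temperature-softened output distribution. This is only an illustration under my own assumptions; the function names, the temperature of 2.0, and the toy logits are all made up, and a real VLM pipeline would do this inside a training framework over batches of your domain data.

```python
import math

def softmax(logits, temperature=1.0):
    # Divide by the temperature, then apply a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on softened distributions, scaled by T^2
    # (the usual convention so gradient magnitudes stay comparable across T).
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits for one example over three classes.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 0.2]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge, so minimizing it over domain-specific inputs pulls the small model toward the large one on exactly the cases you care about.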

What’s the use-case and what kind of latency do you require?