Hacker News new | ask | show | jobs
by zerobees 2 hours ago
Right, and that's what I find frustrating. There are so many use cases where a local, purpose-built model that's dependably good at one thing would really make a difference. But no one is going to throw a billion dollars to give us amazing dust removal, flawless scene segmentation, etc.

Instead, you're supposed to upload it to the cloud and ask a big, multimodal frontier model to maybe please do the thing you want and nothing else.

1 comments

The highest return small local model for me has been the in-built OCR that macOS has. It has finally "solved" OCR by making high-quality results accessible to everyone. Yet the state of art outside the apple ecosystem seems to be tesseract (poor results), or extremely heavy VLMs.