by serjester 330 days ago
VLMs capable of parsing images with high fidelity are 10-50x cheaper than the frontier models. Any savings from not parsing are quickly going to be wiped out if someone has any actual traffic. Not to mention the massive hits to long-context accuracy and latency.
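
Rough back-of-envelope for the cost argument above, as a sketch only. The 10-50x ratio is the one quoted above; everything else is an assumption: the function name `break_even_queries` is hypothetical, costs are in arbitrary units per page, and it assumes a frontier model reading parsed text costs less per query than reading the raw page image.

```python
def break_even_queries(frontier_image_cost, parse_discount, frontier_text_cost):
    """Queries per page after which parsing once beats re-reading the image.

    All inputs are hypothetical, in arbitrary cost units per page:
      frontier_image_cost: frontier model reads the raw page image (paid per query)
      parse_discount:      how many times cheaper the parsing VLM is (10 to 50 above)
      frontier_text_cost:  frontier model reads the parsed text (paid per query)
    """
    if parse_discount <= 0:
        raise ValueError("parse_discount must be positive")
    if frontier_text_cost >= frontier_image_cost:
        # Assumption violated: parsing never pays off on cost alone.
        raise ValueError("parsed text must be cheaper per query than the raw image")

    # One-time cost of parsing the page with the cheaper VLM.
    one_time_parse = frontier_image_cost / parse_discount
    # What each later query saves by reading text instead of the image.
    saving_per_query = frontier_image_cost - frontier_text_cost
    return one_time_parse / saving_per_query


if __name__ == "__main__":
    # Made-up inputs: image costs 1.0 per query, parsed text costs 0.5 per query.
    for discount in (10, 50):
        print(discount, break_even_queries(1.0, discount, 0.5))
```

With those made-up inputs the break-even lands well under one query per page at both ends of the 10-50x range, which is the point: the one-time parse pays for itself almost immediately once pages are queried more than once. Accuracy and latency are not modeled here.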