by anon373839 466 days ago
A bit of a tangent, but aren’t CNNs still dominating over ViTs among computer vision competition winners?

I haven't watched that space very closely, but IMO ViTs have a lot of untapped potential: compared to CNNs, whose convolutions only mix nearby pixels in each layer, attention lets the model relate any part of the input to any other part and so learn more complex relations in the data. Where this matters, I expect it to matter a lot. OCR is probably not the best example, though. Understanding the surrounding context helps there, but I don't think it's that critical for performance.
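To make the "complex relations" point a bit more concrete, here is a rough sketch of how far information can travel in one layer. It is only a toy: the 14x14 patch grid and both function names are my own assumptions, and it ignores strides, pooling and dilation, which let real CNNs grow their receptive field faster than this.

```python
def conv_receptive_field(pos, grid, kernel=3, layers=1):
    """Cells that can influence `pos` after `layers` stacked
    kernel x kernel convolutions (stride 1, no dilation)."""
    r, c = pos
    reach = layers * (kernel // 2)  # each layer extends the reach by kernel // 2
    return {
        (i, j)
        for i in range(max(0, r - reach), min(grid, r + reach + 1))
        for j in range(max(0, c - reach), min(grid, c + reach + 1))
    }


def attention_receptive_field(pos, grid):
    """With global self-attention, any patch can attend to every patch
    in a single layer, so the position does not matter."""
    return {(i, j) for i in range(grid) for j in range(grid)}


# Hypothetical setup: a 14x14 grid, e.g. a 224x224 image cut into 16x16 patches.
grid = 14
center = (7, 7)

print(len(conv_receptive_field(center, grid, layers=1)))  # 9
print(len(conv_receptive_field(center, grid, layers=7)))  # 196
print(len(attention_receptive_field(center, grid)))       # 196
```

So in this simplified setting a stack of plain 3x3 convolutions needs seven layers before the center cell can see the whole grid, while a single attention layer already can. Whether that extra reach actually helps depends on the task, which is the point about OCR.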