by 0x20cowboy
2 days ago
> There's no 'rigorous comparison' that puts CNNs over ViTs

That's not accurate. My team wrote a paper for school in which a ResNet model outperformed a ViT model of the same size on almost all metrics. These were smaller models, but depending on the use case, that might be what you want.
- Tuning hyperparameters to gain improvement on a dataset when you're constantly looking at the answers is pretty meaningless. It's basically testing on the training data.
- Eval on ImageNet-1k alone (very small, useless for the real world) made me wonder whether it was just overfit to the training set. Would it perform better if trained on the datasets used for the foundation models? I doubt it.
Well, I'm not saying CNNs are bad or useless, at any rate.
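
To make the first point concrete, here's a toy sketch (not from any paper, and every name and number in it is made up for illustration). It assumes 50 hyperparameter configs that all have the same true accuracy of 0.5, so any difference between them on the eval set is pure noise. Picking the "best" one by looking at that eval set still gives a number that looks like an improvement, and it goes away on fresh data.

```python
import random


def simulate_accuracy(true_acc, n_samples, rng):
    """Measured accuracy of a model with the given true accuracy on n_samples examples."""
    correct = sum(rng.random() < true_acc for _ in range(n_samples))
    return correct / n_samples


def select_on_eval_set(n_configs=50, n_eval=200, n_fresh=2000, true_acc=0.5, seed=0):
    """Pick the best of n_configs equally good configs by their score on a reused eval set.

    All parameters are hypothetical; they only control the size of the toy experiment.
    """
    rng = random.Random(seed)
    # Every config has the same true accuracy, so score differences are noise.
    eval_scores = [simulate_accuracy(true_acc, n_eval, rng) for _ in range(n_configs)]
    best_score = max(eval_scores)
    # Score the "winner" on fresh data that played no part in the selection.
    fresh_score = simulate_accuracy(true_acc, n_fresh, rng)
    return best_score, fresh_score


best_score, fresh_score = select_on_eval_set()
print(f"best score on the reused eval set: {best_score:.3f}")
print(f"same config on fresh data:         {fresh_score:.3f}")
```

The gap between the two printed numbers is entirely selection bias, which is why a separate validation split (or a fresh test set) matters when you tune.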