by neonbjb
1165 days ago
You can make almost anything work in DL if you try hard enough, but that doesn't mean it is the correct thing to do. Convolutions have inductive biases which are the cause of many of the problems associated with deep learning over the last 10 years. Researchers don't "love the ViT". They use it because it is simply better in every way, in every application. The only reason convolutions are still used in modern (intelligently designed) ML systems is that it is not known how to build a sparse attention algorithm that achieves 2D and 3D locality and is also compatible with modern accelerators. Swin is an attempt at that, but it is something of a hack.