by ccccppppp
2490 days ago
Thanks for the insight, and for mentioning convolutional LSTM; I wasn't aware such a thing existed.

> Attention is basically a memory module, so if you don't need that it's just a waste of compute resources.

But aren't CNNs also a kind of memory module (i.e., they memorize what leopard skin looks like)? I guess attention is a more sophisticated kind of memory, "more dynamic" so to speak.

Anyway, I'm glad to hear that a transformer architecture isn't totally stupid for my task. I will look up the literature; there seems to be a bit on this matter.