by getnormality
51 days ago
> I think most ML people now think of neural-network architectures as being, essentially, choices of tradeoffs that facilitate learning in one context or another when data and compute are in short supply, but not as being fundamental to learning.

Is this a practical viewpoint? Can you remove any of the specific architectural tricks used in Transformers and expect them to work about equally well?