by Centigonal
415 days ago
There's something that tickles me about this paper's title. The thought that everyone should know these three things. The idea of going to my neighbor who's a retired K-12 teacher and telling her about how adding MLP-based patch pre-processing layers improves BERT-like self-supervised training based on patch masking.