by logophobia 1162 days ago
Reminds me of the ideas behind Google's multi-axis vision transformer (MaxViT): https://arxiv.org/abs/2204.01697

Both use a hierarchical transformer, adapting the transformer architecture so it handles vision tasks more efficiently.