Hacker News
by hervature 845 days ago
Being used as a comparison...

From the abstract:

> Bringing these components together, we are able to build pure CNN architectures without any attention-like operations that are as robust as, or even more robust than, Transformers.