by namjh 1066 days ago
Might be related: https://arxiv.org/abs/2211.16421 It's a paper about learning directly from JPEG encodings, which fits well with vision transformers' patch mechanism.
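
To make the patch connection concrete, here's a rough sketch of why the two line up: JPEG stores an image as 8x8 blocks of DCT coefficients, and a 16x16 ViT patch covers exactly 2x2 of those blocks, so the coefficients can be regrouped into one token per patch without going back to pixels. This is not the paper's pipeline, just an illustration under simplifying assumptions: a single grayscale channel, no level shift, no quantization, no chroma subsampling, and the DCT computed from pixels rather than read from a JPEG file. The function names (`dct_matrix`, `blockwise_dct`, `dct_patch_tokens`) are made up for this example.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    x = np.arange(n)[None, :]   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def blockwise_dct(img, b=8):
    """2D DCT of each non-overlapping b x b block (the transform JPEG applies)."""
    h, w = img.shape
    assert h % b == 0 and w % b == 0, "image size must be a multiple of the block size"
    d = dct_matrix(b)
    # Split into blocks: (H/b, W/b, b, b)
    blocks = img.reshape(h // b, b, w // b, b).transpose(0, 2, 1, 3)
    # D @ X @ D^T per block; matmul broadcasts over the leading block axes
    return d @ blocks @ d.T

def dct_patch_tokens(img, patch=16, b=8):
    """Group DCT blocks into one flat token per patch x patch region."""
    coeffs = blockwise_dct(img, b)      # (H/b, W/b, b, b)
    hb, wb = coeffs.shape[:2]
    r = patch // b                      # blocks per patch side (2 for 16x16 patches)
    assert hb % r == 0 and wb % r == 0, "image size must be a multiple of the patch size"
    t = coeffs.reshape(hb // r, r, wb // r, r, b, b).transpose(0, 2, 1, 3, 4, 5)
    return t.reshape((hb // r) * (wb // r), r * r * b * b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))          # stand-in for one image channel
    tokens = dct_patch_tokens(img)
    print(tokens.shape)                 # one row per 16x16 patch
```

Each row of `tokens` would then go through the usual linear patch embedding. Real JPEG files hold quantized coefficients in YCbCr, usually with subsampled chroma, so a real pipeline has more to handle than this.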