Hacker News
by dartos 512 days ago
> not a lookup table

You can imagine the classic attention mechanism as a lookup table, actually.

Transformers are layers and layers and layers of lookup tables.
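
To make the analogy concrete, here's a rough sketch of one attention query as a "soft" lookup, in plain Python. The function name `soft_lookup`, the toy keys and values, and the `scale` knob are all made up for illustration (real attention uses learned projections and scales by 1/sqrt(d_k)). Instead of returning the one value whose key matches, it returns a softmax-weighted blend of all values; crank the scores up and it behaves almost like an exact table lookup.

```python
import math

def soft_lookup(query, keys, values, scale=1.0):
    """Softmax-weighted average of `values`, weighted by query-key dot products.

    `scale` is a hypothetical sharpness knob for this sketch, not a standard
    attention parameter.
    """
    # Similarity of the query to every key (dot product).
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Blend the values using those weights.
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return out, weights

# Toy "table": three keys, each pointing at a one-dimensional value.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
query = [1.0, 0.0]  # matches the first key best

soft, soft_w = soft_lookup(query, keys, values)                 # blends all three values
sharp, sharp_w = soft_lookup(query, keys, values, scale=50.0)   # close to values[0] alone
```

With the default scale the result is a mix dominated by the first entry; with the large scale nearly all the weight lands on the best-matching key, which is the hard lookup-table case.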