Hacker News
by knuppar 380 days ago
One could argue TF-IDF is a special case of an attention layer... just not quadratic in inference/training, and it's kinda just a quotient. Yeah, maybe we should go back.
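
A rough sketch of one way to read that analogy, not a claim about how any real attention layer is implemented: TF-IDF gives each term a score that is a quotient of counts, and if you normalize those scores to sum to 1 they look like a fixed set of attention weights over the document. There are no pairwise token-token scores, so nothing is quadratic in document length. The function name `tfidf_weights`, the toy corpus, and the smoothed IDF formula (one common variant) are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """Return per-term TF-IDF scores for `doc`, normalized to sum to 1.

    Hypothetical helper for illustration. `doc` is a list of tokens and
    `corpus` is a list of such token lists.
    """
    n_docs = len(corpus)
    term_counts = Counter(doc)
    scores = {}
    for term, count in term_counts.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF (an assumed variant), always positive.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        # Term frequency is itself a quotient: count / document length.
        scores[term] = (count / len(doc)) * idf
    total = sum(scores.values())
    # Normalizing makes the scores behave like attention weights,
    # but they depend only on counts, not on any query-key product.
    return {term: score / total for term, score in scores.items()}


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "purred"],
]
weights = tfidf_weights(corpus[2], corpus)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy corpus the rarest term ("purred") gets the largest weight and the ubiquitous one ("the") the smallest, which is the "attend to what is distinctive" behavior, just computed from static counts instead of learned projections.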