by karmasimida 622 days ago
I mean, it doesn’t necessarily need 2x QK to match the accuracy of a regular transformer, right?