Y
Hacker News
new
|
ask
|
show
|
jobs
by
p1esk
606 days ago
4x seq length expansion doesn’t sound that bad.
1 comments
lechatonnoir
606 days ago
I mean, it's not completely fatal, but it means an approximately 16x increase in runtime cost, if I'm not mistaken. That's probably not worth trying to solve letter counting in most applications.
link
SEGyges
604 days ago
it is not necessarily 16x if you, e.g., decrease model width by a factor of 4 or so also, but yeah naively the RAM and FLOPs scale up by n^2
link