by astrange
848 days ago
I sure am ignoring that, because the bitter lesson of AI usually applies, and it implies that all such research will be replaced by larger generic transformer networks as time goes on. The exception is when you care about efficiency (in training or inference costs), but at the limit, or if all you care about is "better", you don't.