Hacker News new | ask | show | jobs
by logophobia 1160 days ago
I've applied the S4 operator to successfully do long-length video classification. It's massively more efficient than a similarly scaled transformer, but it doesn't train as well. Still, even with S4 I got some impressive results, looking forward to more.