Hacker News new | ask | show | jobs
by mfkasim1 1000 days ago
The author of the paper here. The cubic time and quadratic space complexity is with respect to the number of dimensions (n), not the sequential time steps. The time and space complexity w.r.t. the number of time steps (L) are both linear, specifically O(Ln^3) time and O(Ln^2) space. The method gives larger speed up with longer sequence (>1k time steps) but with relatively small number of dimensions (<= 64 dimensions from our experiments).
2 comments

Thank you for posting on HN and clarifying that. I was so off-the-mark! I'm used to the convention in most deep-learning papers of using N for time steps and D for number of dimensions.

Your work looks more interesting to me now, even though cubic time and quadratic space in the number of dimensions are still a significant drag.

Consider: State-of-the-art models often work with dimensions 2-3 orders of magnitude greater than 64. For example, LLaMA 2 models operate on visible and hidden states on 4096 and 11008 dimensions, respectively.

Anyway, thank you again! I'm adding your paper to my reading list.

Yes, sorry for that. I'm coming from physics, so L for sequence length and n for the number of dimensions make more sense :D

I agree with the cubic time and quadratic space is a big limitation for now and I'm looking for ways to make them linear (or close to linear).

> I'm coming from physics

I figured as much :-) because using n and m for vector and linear map dimensions is actually the older, more established convention.

> I'm looking for ways to make them linear (or close to linear)

The holy grail in AI research right now. I imagine you're looking, or have looked, at mapping models to frequency space to make complexity O(n log n). Take a look at the work the Hazy Research folks have been doing at Stanford -- if you haven't already.

Can you share a link to your code repository? I want to play with your algorithm and observe the speedup on my own. Also I'm a big fan of JAX!
We're preparing (i.e. cleaning up) the code for the repo. Will update you when we release the code.