Hacker News
by modeless 331 days ago
I don't know if this is the one, but something like this is clearly the future IMO. We need more levels of hierarchy to generalize efficiently to longer sequences with high-level structure. Back when Byte Latent Transformers came out, I thought extending the idea to more levels of hierarchy was the way to go, and this seems to be basically that?

Another article about H-Nets: https://main-horse.github.io/posts/hnet-inf/

1 comment

Yes... This seems like a generalization of "large concept models" in a sense.