by yggdrasil_ai 260 days ago
>extension to transformers that can focus the attention on just the relevant context.

That is what a transformer's attention does in the first place, so you would just be stacking two transformers.
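
To make that concrete, here is a minimal sketch of standard scaled dot-product attention in plain Python. The vectors and the helper names (`softmax`, `attention`) are made up for illustration, not taken from the linked work. The softmax over query-key scores is the part that "focuses" on relevant context: keys that match the query get nearly all of the weight.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    # Output is the weighted average of the value vectors.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, output

# Toy data (hypothetical): the first key matches the query,
# the second is orthogonal, the third points the opposite way.
query = [4.0, 0.0]
keys = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

With these toy inputs almost all of the weight lands on the first key, so the output is close to the first value vector. That selection step is already built into every attention layer.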