Hacker News
by kgeist 850 days ago
So the questions should come before the content and it might work?

I think that's also how RWKV works.

1 comment

It's known to help, although I wouldn't expect perfect recall unless the network is big enough.

The network reads the data token by token. So if you put the question at the beginning, it knows what information it needs to pay attention to in the rest of your context. Of course, if the network is too small, recall still won't be perfect for a sufficiently complicated or large question or context.
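
To make that concrete, here's a rough sketch of what "question first" looks like when assembling a prompt. The function name `build_prompt`, the "Question:/Context:/Answer:" labels, and the sample text are all made up for illustration; nothing here is specific to RWKV or any particular library.

```python
def build_prompt(question: str, context: str) -> str:
    """Put the question ahead of the context so a model that reads
    left to right sees what to look for before it reads the data."""
    # The section labels are an arbitrary choice, not a required format.
    return "\n".join([
        f"Question: {question}",
        "",
        "Context:",
        context,
        "",
        "Answer:",
    ])


# Hypothetical example inputs.
question = "Which year was the bridge completed?"
context = "The bridge took six years to build and was completed in 1932."
prompt = build_prompt(question, context)
print(prompt)
```

Whether this ordering actually helps will depend on the model and its size, as noted above.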