by curious_cat_163 841 days ago
Just catching up to this thread again. You had said:

"I was wondering if anyone has tried setting importance of a token as a TF-IDF or BM25 lookup."

So, I take it back. This is not a confusion. You are right to call it out. :)

I like this idea directionally. A lot of energy (literally) would be saved if we could reach comparable model accuracy with static weights like this.
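
To make "static weights" concrete, here is a rough sketch of how I read the idea: each token's importance comes from a precomputed IDF table rather than from anything learned or context-dependent. The toy corpus, the function names (`idf_table`, `static_weights`), and the particular smoothed IDF variant are all my own assumptions for illustration, not something from your message.

```python
import math
from collections import Counter

def idf_table(corpus):
    """Smoothed IDF per token: log((1 + N) / (1 + df)) + 1 (one common variant)."""
    n_docs = len(corpus)
    # Document frequency: number of documents each token appears in.
    df = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + d)) + 1.0 for tok, d in df.items()}

def static_weights(tokens, idf, default=1.0):
    """Normalize per-token IDF scores over a sequence so they sum to 1.

    Unlike attention, a token's raw score is a pure lookup and does not
    depend on the other tokens around it.
    """
    raw = [idf.get(tok, default) for tok in tokens]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical toy corpus, already tokenized.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
idf = idf_table(corpus)
weights = static_weights("the cat sat".split(), idf)
```

Here "the" appears in every document, so it gets a lower weight than "cat" or "sat". The lookup is cheap, but the raw score for a token is the same no matter which sentence it shows up in.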

However, I do think that this (as stated in your original message) would not work as well as a transformer or an SSM, and I have already explained my reasoning. I don't have empirical proof (I haven't run the experiment), but if you believe in it, you should try it and share your findings.