by curious_cat_163
841 days ago
Just catching up on this thread again. You had said: "I was wondering if anyone has tried setting importance of a token as a TF-IDF or BM25 lookup." So, I take it back: this is not a confusion, and you are right to call it out. :) I like this idea directionally. A lot of energy (literally) would be saved if we could reach the same model accuracy with static weights like this. However, I do think that this (as stated in your original message) would not work as well as a transformer or an SSM, and I have already explained my reasoning as to why. I don't have empirical proof (I haven't run the experiment), but if you believe in it, you should try it and share your findings.
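
For anyone who wants to try it, here is a rough sketch of one way to read "importance of a token as a TF-IDF lookup". It is only an illustration under assumptions: the function names `idf_table` and `static_importance`, the smoothed IDF formula, and the toy corpus are all made up for this example and are not from the thread. The weights come from a table computed once from corpus counts, so nothing is learned and a token's weight does not depend on word order or on its neighbours.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Smoothed IDF per token, computed once from a tokenized corpus.

    Assumed formula (one common smoothing): log((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(corpus)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0
        for tok, df in doc_freq.items()
    }


def static_importance(tokens, idf, default=1.0):
    """Per-position TF-IDF weights, normalized to sum to 1.

    Tokens missing from the table fall back to `default` (an assumption).
    """
    if not tokens:
        return []
    tf = Counter(tokens)
    raw = [tf[t] / len(tokens) * idf.get(t, default) for t in tokens]
    total = sum(raw)
    return [w / total for w in raw]


# Toy corpus, purely illustrative.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "ran"],
]
idf = idf_table(corpus)
weights = static_importance(["the", "dog", "ran"], idf)
```

In this toy run, "dog" (seen in one document) gets a larger weight than "the" (seen in all three), and shuffling the input tokens leaves each token's weight unchanged. Swapping in BM25 would change the scoring formula but not the lookup structure.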