by infecto
974 days ago
I have been trying to understand the hype as well. Happy to see all the work still happening in this space. I was pretty curious about the context limit. I am not an expert in this area, but I always thought the biggest problem was the length of your original text, so typically you might only encode a sentence or a small selection of sentences. You could always stuff more in, but then you are potentially losing specificity, which I would think is a function of the dimensionality. This model's embeddings are 768-dimensional; are they saying I can stuff in 8k tokens' worth of text and use it just as well as I have with other models at the 1-3 sentence level?

This also opens up another question, though: how would that compare to using an LLM to summarize that paper and then just embedding the summary?
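
To make the specificity worry concrete, here is a toy sketch in plain Python. It assumes, purely for illustration, that each sentence embedding is an independent random 768-dimensional unit vector and that a long text is embedded by mean-pooling its sentence vectors. Neither assumption describes how the model under discussion actually works, and the function names are hypothetical. Under those assumptions, the pooled vector's cosine similarity to any one of its sentences shrinks roughly like 1/sqrt(n) as more sentences are stuffed in.

```python
import math
import random

DIM = 768  # embedding width mentioned in the comment


def random_unit_vector(rng, dim=DIM):
    """Stand-in for a sentence embedding (assumption: random direction)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def mean_pool(vectors):
    """Average the vectors component-wise, then rescale to unit length."""
    n = len(vectors)
    pooled = [sum(column) / n for column in zip(*vectors)]
    norm = math.sqrt(sum(x * x for x in pooled))
    return [x / norm for x in pooled]


def cosine(a, b):
    """Cosine similarity; both inputs are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def similarity_to_first_sentence(n_sentences, seed=0):
    """Similarity between a pooled 'document' vector and its first sentence."""
    rng = random.Random(seed)
    sentences = [random_unit_vector(rng) for _ in range(n_sentences)]
    return cosine(mean_pool(sentences), sentences[0])


if __name__ == "__main__":
    # Print how similarity to a single sentence changes with document length.
    for n in (1, 3, 30, 300):
        print(n, round(similarity_to_first_sentence(n), 3))
```

A real long-context embedding model is trained end to end and may hold up far better than this toy suggests, so the sketch only illustrates the intuition behind the question. It says nothing about the summarize-then-embed alternative, which would have to be compared empirically on a retrieval task.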