
I suppose superficially, for retrieval-augmented generation use cases, the more data you have (and the better you are at retrieval), the more useful extra space in the context window is, because whatever you retrieve gets put into the model's context for the generation step. That said, literally everyone has more than enough data to completely dwarf the context window of any existing model, so it seems true but irrelevant.
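To make the "dwarfs the context window" point concrete, here's a rough sketch of what the packing step looks like. Everything in it is made up for illustration: `pack_context` is a hypothetical helper, the scores are assumed to come from whatever retriever you use, and the whitespace word count is a crude stand-in for a real tokenizer.

```python
def pack_context(chunks, budget_tokens, count_tokens=lambda text: len(text.split())):
    """Greedily pick retrieved chunks that fit in a fixed context budget.

    chunks: list of (relevance_score, text) pairs from a retriever (assumed).
    count_tokens: crude whitespace word count, standing in for a real tokenizer.
    """
    selected = []
    used = 0
    # Consider the highest-scoring chunks first.
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost > budget_tokens:
            continue  # does not fit; a smaller chunk later still might
        selected.append(text)
        used += cost
    return selected


retrieved = [(0.9, "a b c"), (0.5, "d e f g"), (0.8, "h i")]
packed = pack_context(retrieved, budget_tokens=5)
```

With more data than budget, something always gets dropped, so a bigger window only moves the cutoff.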

One thing about Gemini that may be a benefit here is not just the size of the window, but the fact that the context window seems to be better utilised by the model. GPT-4 seems to have a characteristic where the start and end of the context window are much more important to the model than anything in the middle, meaning that if you stuff the window with retrieved data, things in the middle of the context get ignored by the model. I seem to recall that is not the case with Gemini, which takes more notice of things in the middle of the window. Maybe attention is all you need.[1]
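If the model really does favour the edges of the window, one workaround is to order the retrieved chunks so the most relevant ones sit at the start and end and the weakest end up in the middle. A minimal sketch, assuming you already have chunks ranked most relevant first (`order_for_edges` is a hypothetical name, not from any library):

```python
def order_for_edges(ranked):
    """Reorder items so the most relevant land at the start and end.

    ranked: items sorted most relevant first (assumed input).
    Items alternate between the front and the back, so the least
    relevant ones end up in the middle of the result.
    """
    front, back = [], []
    for i, item in enumerate(ranked):
        if i % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    # back was filled in descending relevance; reverse it so the
    # second-most-relevant item is the very last one.
    return front + back[::-1]


ordered = order_for_edges(["a", "b", "c", "d", "e"])
```

Here "a" (best) goes first, "b" (second best) goes last, and "e" (worst) sits in the middle.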

[1] Sorry for the pun but I couldn't resist. In any case, if you search "Perplexity over long sequences" in the Gemini 1.5 tech report, it explains this effect and shows their results: https://storage.googleapis.com/deepmind-media/gemini/gemini_...