by prithvi24 730 days ago
> Researchers inside the AI unit have told colleagues they’re proud of their advances on Gemini, such as its “context window,” the amount of information the system can analyze at once. This is particularly useful to a company whose enormous amount of data is one of its key competitive advantages.

What does a large context window have to do with Google's data moat?

3 comments

It's the sort of careless, superficially meaningful statement that an LLM would make...
My reading is that the large context window makes Gemini useful as a tool to Google.
I suppose, superficially, for retrieval-augmented generation use cases, the more data you have (and the better you are at retrieval), the more useful extra space in the context window is, because you put the retrieved material into the model's context to do the generation. That said, literally everyone has more than enough data to completely dwarf the context window of any existing model, so it seems true but irrelevant.
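To make the "put things from retrieval into the context" step concrete, here is a rough sketch of packing retrieved chunks into a fixed window. Everything in it is an assumption for illustration: `count_tokens` and `pack_context` are made-up names, the scores would come from whatever retriever you use, and whitespace-split words stand in for real tokenizer tokens.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace-split words approximate tokenizer tokens.
    return len(text.split())


def pack_context(scored_chunks, budget):
    """Greedily keep the highest-scoring chunks that still fit in the budget.

    scored_chunks: list of (relevance_score, text) pairs from a retriever.
    budget: maximum number of (approximate) tokens allowed in the context.
    Returns the kept texts in descending score order and the tokens used.
    """
    packed, used = [], 0
    for _score, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:
            packed.append(text)
            used += cost
    return packed, used


chunks = [(0.9, "a b c"), (0.5, "d e f g"), (0.7, "h i")]
packed, used = pack_context(chunks, budget=5)
```

A bigger window just raises `budget`. Once the corpus is far larger than any window, which is the point above, what gets in depends mostly on retrieval quality.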

One thing about Gemini that may be a benefit here is not just the size of the window, but the fact that the model seems to make better use of it. GPT-4 seems to weight the start and end of the context window much more heavily than anything in the middle, so if you stuff the window with retrieved data, things in the middle get ignored. IIRC that is not the case with Gemini, which takes more notice of things in the middle of the window. Maybe attention is all you need.[1]

[1] Sorry for the pun but I couldn't resist. In any case, if you search for "Perplexity over long sequences" in the Gemini 1.5 tech report, it explains this effect and shows their results: https://storage.googleapis.com/deepmind-media/gemini/gemini_...
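For models that do favour the start and end of the window, one workaround people use is to reorder retrieved chunks so the best ones sit at the edges and the weakest end up in the middle. This is only a sketch of that idea, not something taken from the Gemini report; `edges_first` is a made-up name, and the input is assumed to be ranked best-first already.

```python
def edges_first(ranked):
    """Reorder a best-first list so the top items land at both ends.

    Items at even ranks fill the front in order; items at odd ranks fill
    the back in reverse, pushing the lowest-ranked items to the middle.
    """
    front, back = [], []
    for i, item in enumerate(ranked):
        if i % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]


order = edges_first([1, 2, 3, 4, 5])  # 1 is the most relevant chunk
```

Here the two most relevant chunks (1 and 2) end up first and last, and the least relevant one (5) lands in the middle. If the model really does pay even attention across the window, this reordering shouldn't be needed.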