Hacker News
by Yusefmosiah 849 days ago
I see a lot of talk about retrieval over long context. Some even think this replaces RAG.

I don't care if the model can tell me which page in the book or which code file has a particular concept. RAG already does this. I want the model to notice how a concept is distributed throughout a text, and be able to connect, compare, contrast, synthesize, and understand all the ways that a book touches on a theme, or to rewrite multiple code files in one pass, without introducing bugs.

How does Gemini 1.5's reasoning compare to GPT-4? GPT-4 already has superhuman memory; its bottleneck is its relatively weak reasoning.

2 comments

In my experience (I work mostly, and in depth, with Bard/Gemini), Gemini's reasoning capability is quite good. Gemini Pro is already much better than ChatGPT 3.5, but it still makes quite a few mistakes along the way. What is more worrying is that when these models make mistakes, they try really hard to justify their flawed reasoning, which in practice misleads users. Because of the models' strong mimicry ability, users have to pay close attention to validate the output and eventually spot the errors. Of course, the reasoning is still far below human level, so I'm not sure whether these models add value or are more of a burden.
In my opinion, the most impressive demonstration of long context is this:

https://imgur.com/a/qXcVNOM

It tests the model's ability to translate an extremely obscure language after being given a single grammar book as context.