HN Mirror



	by bitexploder 325 days ago
	Don’t sleep on Gemini’s Deep Research feature either. I use it for my car work and it beats ChatGPT’s offering at that price point every time.

2 comments

losvedir 325 days ago

I dunno, I use Deep Research from Claude, ChatGPT, and Gemini, and Gemini is the only one that ignores my requests and always produces the most inane high school student wannabe management consultant "report" with introduction and restatement of the problem and background and all that. Its "voice" (the prose, I mean, not text to speech) is so irritating I've stopped using it.

The other ones will do the thing I want: search a bunch, digest the results, and give me a quick summary table or something.


bluecalm 324 days ago

Gemini hallucinates a lot. When I ask it about my own software, it not only changes my name to a similar one that's common in my language, but also makes up stuff about our team, saying some stranger works with us (he works in the same niche, but that's about it).

It's annoying when it's so confident making up nonsense.

Imo Chat GPT is just a league above when it comes to reliability.


bitexploder 324 days ago

I just end up using both for research type things. They both end up doing better on certain topics or types of work. For $20/mo why not both :)

I like ChatGPT as a product more, but Gemini does well on many things that ChatGPT struggles with a little more. Just my anecdotes.


Moosdijk 324 days ago

>Imo Chat GPT is just a league above when it comes to reliability.

Which is, in my opinion, the #1 metric an LLM should strive for. It can take quite some time to get anything out of an LLM. If the model turns out to be unreliable/untrustworthy, the value of its output is lost.

It's weird that modern society (in general) so blindly buys into all of the marketing speak. AI has a very disruptive effect on society only because we let it happen.


astrange 324 days ago

I like Gemini Deep Research because ChatGPT's has very low limits, but it is extremely on rails. Yesterday as an experiment I asked it to do a bunch of math rather than write a report, and it did the math but then wrote a report scolding me for not appreciating the beauty of the humanities.


bitexploder 324 days ago

I suppose it depends. I think of it the way this article suggests: it is very good at searching and scraping a lot of websites fast, and then summarizing them somewhat.


niklassheth 325 days ago

I've found the same, but I also haven't gained much value out of "deep research" products as a whole. When I last tested them with topics I'm familiar with, I found the quality of research to be poor. These tools seem to spend their time searching for as much content as possible, then they dump it all into a report. I get better outcomes by extensively searching for a handful of top quality sources. Most of the time your question (or at least some subquestions) has already been answered by an expert, and you're better off using their work than sloppily recreating it.


ghostpepper 325 days ago

This raises the question of what would be required to get an AI chatbot to emulate the process you (and others, including myself) use manually, and whether it's possible purely through different prompting.

Is the fundamental problem that it weights all sources equally so a bunch of non-experts stating the wrong answer will overpower a single expert saying the correct answer?
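
To make that concrete, here's a toy sketch of the difference between counting every source equally and weighting by source reliability. The reliability scores and the `aggregate` helper are made up for illustration; this is not a claim about how any of these products actually rank sources.

```python
from collections import defaultdict

def aggregate(claims, use_weights):
    """Pick the answer with the highest total score.

    claims: list of (answer, reliability) pairs, where reliability is a
    hypothetical score in [0, 1] assigned to the source.
    """
    totals = defaultdict(float)
    for answer, reliability in claims:
        # Equal weighting counts each source as one vote;
        # weighted voting counts each source by its reliability.
        totals[answer] += reliability if use_weights else 1.0
    return max(totals, key=totals.get)

# Five low-reliability sources repeat the wrong answer; one expert is right.
claims = [("wrong", 0.1)] * 5 + [("right", 0.9)]

equal_vote = aggregate(claims, use_weights=False)    # majority wins
weighted_vote = aggregate(claims, use_weights=True)  # expert outweighs the crowd
```

With equal weights the five non-experts outvote the expert; with reliability weights the single expert wins. The hard part, of course, is where the reliability scores come from.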


simonw 325 days ago

This post has some interesting suggestions about that: https://open.substack.com/pub/mikecaulfield/p/is-the-llm-res...
