Hacker News
by ernesto95 560 days ago
Interesting that the results can be so different for different people. I have yet to get a single good response (in my research area) for anything slightly more complicated than what a quick google search would reveal. I agree that it’s great for generating quick functioning code though.
8 comments

> I have yet to get a single good response (in my research area) for anything slightly more complicated than what a quick google search would reveal.

Even then, with search enabled it's way quicker than a "quick" google search and you don't have to manually skip all the blog-spam.

Google search was great when it came out too. I wonder what 25 years of enshittification will do to LLM services.
Enshittification happened, but look at how life changed since 1999 (25 years, as you mentioned). Songs in your palm, search in your palm, maps in your palm or car dashboard, live traffic rerouting, track your kid's plane from home before leaving for the airport, book tickets without calling someone. WhatsApp connected more people than anything.

Of course there are scams and online indoctrination; not denying that.

Maybe each service degraded from its original nice view but there is an overall enhancement of our ability to do things.

Hopefully the same happens over the next 25 years. A few bad things but a lot of good things.

I think I had most or all of that functionality in 2009, with Android 2.0 on the OG Motorola Droid.

What has Google done for me lately?

But also what new tools will emerge to supplant LLMs as they are supplanting Google? And how good will open source (weights) LLMs be?
Absurd, the claim that Google search was better 25 years ago than today. That's vastly trivializing the volume and scale that Google needs to process.
I'm using it to aid in writing PyTorch code and God, it's awful except for the basic things. It's a bit more useful in discussing how to do things rather than actually doing them though, I'll give you that.
Claude is much better at coding and generally smarter; try it instead.

o1-preview was less intelligent than 4o when I tried it, better at multi-step reasoning but worse at "intuition". Don't know about o1.

o1 seems to have some crazy context length / awareness going on compared to current 3.5 Sonnet, from playing around with it just now. I'm not having to 'remind' it of initial requirements etc. nearly as much.
I gave it a try and o1 is better than I was expecting. In particular the writing style is a lot lighter on "GPTisms". It's not very willing to show you its thought process though, the summaries of it seem to skip a lot more than in the preview.
I think the human variable is that you need to know enough to be able to ask the right questions about a subject, while still knowing little enough about it that you can learn something from the answers.

Because of this, I would assume it is better for people whose interests have more breadth than depth, and less impressive to those whose interests are narrow but very deep.

It seems obvious to me the polymath gains much more from language models than the single-minded subject expert trying to dig the deepest hole.

Also, when all the use is summed up, the single-minded subject expert is much more at the mercy of whatever happens to be in the training data than the polymath is.

I have the $20 version. I fed it code from a personal project, and it did a commendable job of critiquing it, giving me alternate solutions and then iterating on those solutions. Not something you can do with Google.

For example, ok, I like your code but can you change this part to do this. And it says ok boss and does it.

But over multiple days, it loses context.

I am hoping to use the $200 version to complete my personal project over the Christmas holidays. Instead of me spending a week, I maybe will spend 2 days with ChatGPT and get a better version than I initially hoped to.

For code review maybe, it's pretty useful.

Even with the $20 version I've lost days of work because it's given me ideas and solutions that are flat out wrong or misleading but sound reasonable, so I don't know if they're really that effective though.

Have you used the best models (i.e. ones you paid for)? And what area?

I've found they struggle with obscure stuff, so I'm not doubting you, just trying to understand the current limitations.

Try turning search on in ChatGPT and see if it picks up the online references? I've seen it hit a few references and then get back to me with info summarised from multiple sources. That's pretty useful. Obviously your case might be different, if it's not as smart at retrieval.
My guess is that it has more to do with the person than the AI.
It has a huge amount to do with the subject you're asking it about. His research area could be something very niche with very little info on the open web. Not surprising it would give bad answers.

It does exponentially better on subjects that are very present on the web, like common programming tasks.

How do you get Google search to give useful results? Often for me the first 20 results have absolutely nothing to do with the search query.