Hacker News
by Udo 1139 days ago
I got the impression that they gimped 3.5 severely over time. Since 4 is still restricted to 25 messages / 3 hours (for paying customers!), I sometimes fall back on 3.5. Of course it's impossible to prove, but it feels like it's failing hilariously at tasks it could do easily a few months ago.

I wonder if more people have this suspicion or if it's just my imagination?

4 comments

I would recommend signing up to GPT-4 API access and, upon hopefully getting it, using a third-party frontend like https://bettergpt.chat/ rather than the official ChatGPT page.

You'll never hit a rate limit as far as I can tell, and it's usage-based so it will probably come out cheaper than $20/mo for regular usage.
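
To put rough numbers on the "cheaper than $20/mo" guess, here's a back-of-the-envelope sketch. The per-token prices are my assumptions (roughly what I remember the GPT-4 8K rates being, so check the pricing page), and the function names are made up for illustration.

```python
# Rough break-even sketch. The per-1K-token prices below are assumptions,
# not figures quoted anywhere in this thread; substitute the current rates.
PROMPT_PRICE_PER_1K = 0.03      # assumed USD per 1K prompt tokens
COMPLETION_PRICE_PER_1K = 0.06  # assumed USD per 1K completion tokens
SUBSCRIPTION_USD = 20.00        # the $20/mo plan mentioned above


def monthly_api_cost(messages_per_day, prompt_tokens, completion_tokens, days=30):
    """Estimated monthly API spend in USD for a steady daily usage pattern."""
    per_message = (
        (prompt_tokens / 1000) * PROMPT_PRICE_PER_1K
        + (completion_tokens / 1000) * COMPLETION_PRICE_PER_1K
    )
    return messages_per_day * days * per_message


def cheaper_than_subscription(messages_per_day, prompt_tokens, completion_tokens):
    """True if the estimated API spend is below the flat subscription price."""
    cost = monthly_api_cost(messages_per_day, prompt_tokens, completion_tokens)
    return cost < SUBSCRIPTION_USD


if __name__ == "__main__":
    # Hypothetical usage pattern: 10 messages a day, 500 tokens each way.
    print(round(monthly_api_cost(10, 500, 500), 2))
    print(cheaper_than_subscription(10, 500, 500))
```

Keep in mind that a chat frontend resends the conversation history with every message, so the prompt token count per message grows in long conversations and the estimate gets optimistic.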

I definitely noticed a drop in quality when the gimped (but presumably dramatically cheaper to run) GPT-3.5-turbo model was introduced on the free version. As a paying subscriber I think you should still have access to the original GPT-3.5 (as "Legacy"). Have you compared them?

I don't think there are two versions of GPT-3.5. It all seems to be just code-davinci-002 with fine-tuning on top.

https://platform.openai.com/docs/model-index-for-researchers

It could be from training it more to be safer. This was noted by Microsoft early on with GPT-4. Specifically, when looking at the TikZ unicorn qualitative benchmark, the unicorn got better with more epochs, which is obviously expected.

However, very interestingly, the unicorn image got far worse when they trained the model to be safer by trying to correct discrimination against various demographics.

It isn't intuitive to me why that would happen, and it seems to conflict with what has been shown in ROME, etc. So I'm surprised it hasn't been commented on more. It's certainly one of the best examples of how we don't understand what's going on inside these models, and of how that leads to very unexpected outcomes.

The tl;dr that I recall is that the current 3.5 sacrifices some things for efficiency. The older 3.5 that's being phased out (but I think is still accessible in the UI) was the original one which I assume was too expensive or risky to keep running as-is.

I didn't copy a reference; I've just been reading AI topics on HN lately.