HN Mirror

by koochi10 1066 days ago

This article is a little bit of a red herring. OpenAI is not Apple, in the sense that they are not great at building user-facing products. They are great at building the world's best AI models. They've known this since the inception of the GPT models: the only way to access those models was via the API.

Late last year, we saw the release of GPT's text-davinci-003, and in an attempt to showcase this new model to the research community, they launched ChatGPT.

I think what we are seeing now is that ChatGPT is best when it is close to existing applications. For example, what 14-year-old is using the ChatGPT app rather than Snapchat's AI chat, which uses the API internally?

The recent drop in usage could likely be attributed to such preferential shifts, further compounded by the timing of school holidays.

1 comments

uLogMicheal 1066 days ago

Why is it so hard to believe that they have lessened the capability of the chat model? If we had transparency true to the name, we would be able to confirm or deny differences. Now we are stuck in a state of debate / confusion on this.

Even if the capability is restored, OpenAI still needs to overcome the grudge they are creating among users with regard to openness. Many may be waiting to jump ship to more open models.

Elites gaining access to the best models while everyone else gets the censored/delayed rollout in the name of safety needs to stop. OpenAI should rebrand or return to core values. Sure they contribute to open source, but do they contribute their best to open source as originally intended?

YeGoblynQueenne 1066 days ago

>> Why is it so hard to believe that they have lessened the capability of the chat model?

One obvious question is: how would they do it? How does one nerf a language model? Train it again with less data, or with different hyperparameters specially chosen to make it worse? Given the costs of training LLMs, that sounds like it would need a very strong motivation.

Fine-tune it, or RLHF it so it's doing worse? That's not cheap either, and what would be the benefit justifying the expense? Nerf a model, to achieve what?

Besides, I think you're assuming a degree of fine control over LLM training that just isn't there. If it were that easy to control performance, it would also be much easier to train (both pre-train and fine-tune) LLMs, and OpenAI would not be in the dominant position they are in right now.

koochi10 1066 days ago

What proof do we have that they "lessened the capability"? So far I've only seen rumors on Hacker News.

Even if they were malicious, what benefit would OpenAI get from lessening the model for its users only to give it to "Elites"?

This sounds like a conspiracy theory to me.

uLogMicheal 1066 days ago

Your response does two things:

Dismisses the countless examples given, from me in this thread/my comment history, and many other people in many threads on this website and Reddit.

Dismisses, as conspiracy, the premise that certain people get access to unfiltered models.

In the "Sparks of AGI" paper from Microsoft, the researcher mentions the differences between the private/test models and the ones prepped for consumers. If you want a hard-to-ignore visual example, just look at the unicorn they drew for that paper and then look at https://gpt-unicorn.adamkdean.co.uk/.

I hope you can add to these discussions rather than primarily be dismissive. This is not a conspiracy theory; we do not know the intention behind the changes, and I am not speculating on the intentions of the actors.

Sparks of AGI: https://arxiv.org/pdf/2303.12712.pdf

koochi10 1066 days ago

We'll have to agree to disagree on the given "evidence".

The unicorns, from my perspective, don't appear to have changed in any notable way. It's interesting, though, that we're assessing a language bot based on its ability to generate a drawing. After reviewing the linked blog post, I agree with the author's observation that there don't seem to be any significant alterations in the unicorn.

Indeed, there are numerous instances of developers experimenting with prompt engineering, discovering what methods work best.

However, I find it difficult to regard this as anything more than speculation for now.

uLogMicheal 1066 days ago

And there's an army of people who dismiss concerns, discussion, and evidence presented on this topic regardless of what is provided. On subjective topics people can always dispute examples. Some people abuse the courtesy of those who provide such examples; you've further proven that here.

Regardless of whether we should be benchmarking something that was claimed to be multimodal on imagery, can you genuinely not see the difference here?

https://imgur.com/a/Eburq3B

Maybe your internal prompt is primed to disagree regardless of what is presented?

Edit: https://www.youtube.com/watch?v=qbIk7-JPB2c&t=1585s even mentions what you claim not to see. Safety degrades the model.
