by michelsedgh 778 days ago
I agree with you somewhat. You are correct unless they have a much better GPT model that they have not released for whatever reason. They are a year ahead of competitors and GPT-4 is pretty old now. I find it hard to believe they don’t have much more capable models now. We will see, though.

OpenAI's stuff has been quite polished on release since GPT-4, or even GPT-3.5.

They are no doubt sitting on ultra-polished stuff. When you are the tip of the arrow and the cutting edge itself, though, it might not be as efficient, but does it ever show you things you can’t unsee.

OpenAI can launch a video thing a day after because it’s ready to go. I am less and less skeptical every time they ship, because the quality of the first version isn’t sliding backwards, even in different areas like video.

Maybe releasing it is strategic, or releasing it also requires supporting it infrastructure-wise and then some. That might be a challenge.

My feeling is the next model, or an in-between one, may have massive efficiency and performance improvements without having to go quantum with brute forcing it.

Meanwhile others who are following what OpenAI has done seem to be able to optimize it and make it more efficient whether it’s open source or otherwise.

Both are doing important work and I'm not sure I want to see it as a winner-take-all game.

The way AI vendors respond so suddenly to one another’s launches feels like they are always ready to launch, and in the meantime keep adding functionality that could also ship.

It reminds me of when Microsoft spent a billion dollars advertising that Bing had a billion pages indexed. Google stayed quiet. Then, when the money was spent by Microsoft, Google simply added a zero or two to their search page, back when they used to list how many pages they had indexed. They were just sitting on it already done, announcing it when it was to their benefit.

Also, what will the effect of open models be on the LLM provider industry? What effect will Meta’s scorched earth policy of killing markets by releasing very good open models have?

I use LLMs constantly, but no longer in a commercial environment (I am retired except for writing books, performing personal research projects, and small consulting tasks). I now usually turn first to local models for most things: ellama+Emacs is good enough for me to mostly stop using GPT-4+Emacs or GitHub Copilot, and the latest open 7B, 8B, and 30B models running on my Mac seem sufficient for most of the NLP and data manipulation things I do.

However, it is also fantastic to have long context Gemini, OpenAI APIs, Claude, etc. available when needed or just to experiment with.

GPT-4 is not a single model. The GPT-4 that was initially released a year ago is way worse in benchmarks than the newest versions of it, and the original version has been beaten by quite a lot of other models by this point.

The newest version of GPT-4 is probably still overall the best model currently, but it is only a few months old, and the picture depends a lot on what benchmarks you are looking at.

E.g. for what we are doing at our company (document processing, etc.) Claude-3 Opus and Gemini-1.5 Pro are currently the better models. The newest GPT-4 even performed worse than a previous version.

So to me it def. seems like the gap is getting smaller. Of course, OpenAI could be coming out with GPT-5 next week and it could be vastly better than all other current models.

There's wide speculation that what will be branded as either GPT-4.5 or GPT-5 has finished pretraining now and is undergoing internal testing for a fairly near-term release.
My speculation is that internally they have much stronger models like Q*, but they won’t be able to release them to the public even if they want to, for lack of compute, for safety, and probably for other reasons they see…
They don't actually care about safety, that's a lie, so compute and business strategy are the only things stopping them.

Sora is the same. It's not ready and it's too slow.

I am curious whether this is true - OAI at least has the reputation in the industry of caring the least about safety of the major labs
If they don’t care about safety (or perceived safety), why do they spend so much time lobotomizing models for safety reasons?
market reach e.g. ability to have chat app on iOS (the API is less limited)

public relations, limit the edge case nonsense 'journalists' hype so corporate execs aren't terrorized into avoiding buying

doesn't have to be as smart as it could be, it just has to be smarter than other models, so might as well file down some sharp edges for sake of above

I didn’t say they don’t care about safety, merely that of the big labs they care the least or close to the least
For PR reasons. They want to avoid government legislation, and pretending that they care helps.
> My speculation is that internally they have much stronger models like Q*

People used to speculate the same about Google. Everyone hypes up their “secret, too powerful to release” models. Remember the dude who was convinced that there was a sentient AI in the machine? The light of actual public release tends to expose a lot of the hype.

That would be a reasonable assumption if OpenAI did not already have an established track record of repeatedly re-defining our fundamental expectations of what technology can do.

GPT-4 was already completed and secretly being tested on Bing users in India in mid-2022 (there were even Microsoft forum posts asking about the funny chatbot). Even after heavy quantization and the alignment tax GPT-4 is still the bar to beat. It's been two years and their funding has increased over 10x since then.

Short of a fundamental Hard Problem that they cannot overcome, their internal bleeding edge models can reasonably be assumed to possess significantly greater capabilities.

Honestly I'm pretty puzzled by this mystical fog that hangs over OpenAI's skunkworks projects - don't people leave for other jobs/go to conferences etc.?

I'm surprised that nobody can tell what they in fact do or do not have.

Truth tends to take the wind out of hype's sails.

With hundreds of billions on the line for the founders and a whole lot of likely unvested stock options for the employees, it doesn't seem like anyone wants to open up about what's actually going on day to day.

I'm not saying Claude 3 and Gemini are better than GPT-4 in every aspect, but those two models can at least perform addition on arbitrarily long numbers, while GPT-4 struggles.
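
For anyone who wants to check a claim like this themselves, here's a rough Python sketch of how one might score long-number addition. It is not tied to any vendor's API: `answer_fn` is a hypothetical callable that takes a prompt string and returns the model's reply, and `exact_adder` is just a stand-in so the script runs on its own. The prompt wording, digit lengths, and trial count are all my own assumptions.

```python
import random
import re


def make_addition_case(num_digits, rng):
    """Return (a, b, a + b) for two random integers with exactly num_digits digits."""
    low = 10 ** (num_digits - 1)
    high = 10 ** num_digits - 1
    a = rng.randint(low, high)
    b = rng.randint(low, high)
    return a, b, a + b  # Python ints are arbitrary precision, so this sum is exact


def is_correct(answer_text, expected):
    """Compare a textual answer to the exact sum, ignoring commas and spaces."""
    cleaned = answer_text.replace(",", "").replace(" ", "").strip()
    return cleaned.isdigit() and int(cleaned) == expected


def score(answer_fn, digit_lengths, trials=5, seed=0):
    """Return {digit_length: fraction of correct answers} for the given answer_fn."""
    rng = random.Random(seed)
    results = {}
    for n in digit_lengths:
        correct = 0
        for _ in range(trials):
            a, b, expected = make_addition_case(n, rng)
            prompt = f"What is {a} + {b}? Reply with only the number."
            if is_correct(answer_fn(prompt), expected):
                correct += 1
        results[n] = correct / trials
    return results


def exact_adder(prompt):
    """Hypothetical stand-in for a model: parses the two numbers and adds them exactly."""
    a, b = re.findall(r"\d+", prompt)[:2]
    return str(int(a) + int(b))


if __name__ == "__main__":
    # Swap exact_adder for a function that calls whatever model you want to test.
    print(score(exact_adder, [10, 50, 200]))
```

Plotting accuracy against digit length for each model would show where (or whether) each one starts to fall over, which is more informative than a single anecdote.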