| HN Mirror



	by fdgsdfogijq 1190 days ago
	All these research science bureaucrats at Big Tech could have released LLM models or tried to develop what OpenAI did. But none of them did. We should applaud OpenAI for the innovation and let them do as they please.

1 comments

redox99 1190 days ago

Google (and others) may not have released model weights, but they've published papers, which is ultimately what makes the field advance. OpenAI not only did not publish any GPT4 paper, they haven't even said how many parameters it has.


mach1ne 1190 days ago

Indeed, Google came up with Transformers and decided to gift the model to humanity. In broad strokes, it was luck that OpenAI chose what seems to be the right path in AI.

Closest competitor DeepMind played games, which is intuitively closer to what humans do, but its relevance given aspects of deep learning is questionable.


teruakohatu 1190 days ago

> DeepMind played games, which is intuitively closer to what humans do, but its relevance given aspects of deep learning is questionable.

Reinforcement learning is part of what OpenAI is doing. I don't think Google went down the wrong path. If anything, they should have run down the path they were on.


buildbot 1190 days ago

Then what is this? 99 pages of bullshit? https://arxiv.org/pdf/2303.08774.pdf


redox99 1190 days ago

> Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.

It's 99 pages of marketing material.


buildbot 1190 days ago

Personally, I disagree. There are a lot of interesting tidbits in this paper. More than marketing would need, at least.


whatshisface 1190 days ago

What good bits did you find? (I'm not sure how fruitful the "OpenAI is a Microsoft department" debate is given that they are almost one and everybody knows it, but I am curious if anyone has found anything good in those many pages.)


buildbot 1190 days ago

I think the most interesting thing is their ability to predict performance, from loss and on a wide range of tasks, using a much smaller model. This lets them tune their architecture and hyperparameters, then run a single large training run to get full-scale GPT-4. From the paper, it sounds like they only trained the large model once, then did a reinforcement learning with human feedback (RLHF) fine-tune.

Disclaimer - I work at Microsoft, in AI, and have no internal knowledge about GPT-4.


thatsadude 1190 days ago

Google published papers, but has anybody been able to replicate their results?


redox99 1190 days ago

Yes? "Attention Is All You Need" and AlphaZero are the first that come to mind, but there are thousands.
