| HN Mirror


	by zhwu 135 days ago
	The most surprising part: the agent had access to both H100s and H200s. Without being told, it noticed H200s scored better and started screening ideas on H100s, then promoting winners to H200s for validation. That strategy emerged entirely on its own.

3 comments

rogerrogerr 135 days ago

Why do we think this emerged “on its own”? Surely this technique has been discussed in research papers that are in the training set.

GorbachevyChase 134 days ago

You probably express very few truly original ideas. Let’s not set the bar quite so high unless we are all just a sad simulacrum of “pure” thought.

suddenlybananas 134 days ago

But humans are capable of very many original ideas. Look around you, humans were able to remake the entire world because of these original thoughts.

deadbabe 134 days ago

Original ideas are easy if you allow for bad ideas.

rullelito 134 days ago

Then "on its own" has no meaning, i.e. everything an LLM does is "on its own".

fdghrtbrt 135 days ago

Why surely? Have you never seen an LLM try something new?

rogerrogerr 135 days ago

Is your assertion that no one has ever written "we tried some stuff on the small inexpensive platform first, then moved to the bigger more expensive platform with the more promising options" in a research paper or literally anywhere else?

fdghrtbrt 135 days ago

No, that's not my assertion. In fact I asserted nothing at all.

rogerrogerr 135 days ago

You're speaking in riddles; your communication would be more effective if you didn't do that.

fdghrtbrt 134 days ago

You said "surely", and I asked:

> Why surely? Have you never seen an LLM try something new?

I'm afraid I can't make it any simpler than this.

And I still don't know the answer to how you're so sure. To me there are several explanations, and it seems to you there's only one.

I'm pretty happy with my communication style.

caconym_ 134 days ago

I honestly don't think I have.

In this case, using a cheap(er) signal or heuristic as an initial filter before spending more resources on cases that pass the filter is a pattern that shows up all over the place, and LLMs are good at picking up on patterns like that and generalizing them. AFAICT.
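
As a rough illustration of that filter-then-escalate pattern (this is not code from the experiment being discussed; the function and parameter names are hypothetical), a minimal Python sketch might look like this:

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def screen_then_validate(
    candidates: Sequence[T],
    cheap_score: Callable[[T], float],      # hypothetical: e.g. a run on the cheaper hardware
    expensive_score: Callable[[T], float],  # hypothetical: e.g. a run on the pricier hardware
    promote_top_k: int,
) -> list[tuple[T, float]]:
    """Rank all candidates with a cheap signal, then re-score only the top k."""
    # Stage 1: every candidate gets the cheap evaluation.
    ranked = sorted(candidates, key=cheap_score, reverse=True)
    # Stage 2: only the best k survivors get the expensive evaluation.
    finalists = ranked[:promote_top_k]
    scored = [(c, expensive_score(c)) for c in finalists]
    # Return finalists ordered by the expensive score, best first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data standing in for real experiment results (made up for illustration).
cheap = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.3}
expensive = {"a": 0.2, "b": 0.7, "c": 0.95, "d": 0.4}

winners = screen_then_validate(
    list(cheap), cheap.__getitem__, expensive.__getitem__, promote_top_k=2
)
print(winners)
```

The trade-off is the usual one: the expensive evaluator runs on only `promote_top_k` candidates, at the risk of discarding something the cheap signal underrates.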

anon291 134 days ago

I'm not sure how people say this so confidently. I have a rather esoteric Haskell library that I wrote and have published for years. ChatGPT and Claude both know about it, frequently help me improve it, and propose completely novel approaches. I'm really not sure how people are so confident that they can't think of anything new. This seems like wishful confirmation bias.

caconym_ 134 days ago

> I'm not sure how people say this so confidently.

Say what, exactly?

hhh 135 days ago

Why? The experiment.yaml shows that it is calling H100/H200 explicitly, and it’s pretty common for humans to say “number bigger more gooder” for anything. Lie and reverse the values and see what happens. I would put money on a rabbit hole of complaining about it being misconfigured.

ed 135 days ago

Models are familiar with H100s. They even predate ChatGPT.

Aboutplants 135 days ago

Yeah I thought that was a particularly neat part
