by jefftk 455 days ago

Claude 3.7 gets it exactly right:

To determine which option cools coffee the most, I'll analyze the heat transfer physics involved. The key insight is that the rate of heat loss depends on the temperature difference between the coffee and the surrounding air. When the coffee is hotter, it loses heat faster.

Option 1 (add milk first, then wait):

- Adding cold milk immediately lowers the coffee temperature right away

- The coffee then cools more slowly during the 2-minute wait because the temperature difference with the environment is smaller

Option 2 (wait first, then add milk):

- The hot coffee cools rapidly during the 2-minute wait due to the large temperature difference

- Then the cold milk is added, creating an additional temperature drop at the end

Option 2 will result in the lowest final temperature. This is because the hotter coffee in option 2 loses heat more efficiently during the waiting period (following Newton's Law of Cooling), and then gets the same cooling benefit from the milk addition at the end. The mathematical principle behind this is that the rate of cooling is proportional to the temperature difference, so keeping the coffee hotter during the waiting period maximizes heat loss to the environment.
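The argument above can be checked numerically. What follows is only a minimal sketch, not part of the quoted answer: every temperature, volume, and the cooling constant are illustrative assumptions, and the function names are made up for this example. It also assumes milk and coffee have the same heat capacity per ml and that the cooling constant does not change once the milk is added.

```python
import math


def cool(temp_c, ambient_c, k_per_min, minutes):
    """Newton's law of cooling: the gap to ambient decays exponentially."""
    return ambient_c + (temp_c - ambient_c) * math.exp(-k_per_min * minutes)


def mix(coffee_c, coffee_ml, milk_c, milk_ml):
    """Volume-weighted average (assumes equal heat capacity per ml)."""
    return (coffee_c * coffee_ml + milk_c * milk_ml) / (coffee_ml + milk_ml)


# Illustrative values only; none of these come from the thread.
COFFEE_C, MILK_C, AMBIENT_C = 90.0, 5.0, 20.0
COFFEE_ML, MILK_ML = 200.0, 50.0
K_PER_MIN, WAIT_MIN = 0.05, 2.0  # assumed cooling constant and wait time


def milk_then_wait():
    """Option 1: add the milk, then let the mixture cool."""
    mixed = mix(COFFEE_C, COFFEE_ML, MILK_C, MILK_ML)
    return cool(mixed, AMBIENT_C, K_PER_MIN, WAIT_MIN)


def wait_then_milk():
    """Option 2: let the black coffee cool, then add the milk."""
    cooled = cool(COFFEE_C, AMBIENT_C, K_PER_MIN, WAIT_MIN)
    return mix(cooled, COFFEE_ML, MILK_C, MILK_ML)


if __name__ == "__main__":
    print(f"Option 1 (milk first): {milk_then_wait():.2f} C")
    print(f"Option 2 (wait first): {wait_then_milk():.2f} C")
```

Under these assumptions, Option 1 ends warmer than Option 2 by (milk fraction) × (ambient − milk temperature) × (1 − e^(−k·t)). So in this simplified model, waiting first only wins when the milk is colder than the room, which is the usual fridge-milk case.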

kazinator 455 days ago

That's totally cribbed from some discussion that occurred in its training.

Nevermark 455 days ago

As opposed to humans who all derive the physics of heat transfer independently when given a question like this?

Not picking on you - this brings up something we could all get better at:

There should be a "First Rule of Critiquing Models": Define a baseline system to compare performance against. When in doubt, or for general critiques of models, compare to real world random human performance.

Without a real practical baseline to compare with, it's too easy to fall into subjective or unrealistic judgements.

"Second Rule": Avoid selectively biasing judgements by down-selecting performance dimensions. For instance, don't ignore differences in response times, grammatical coherence, clarity of communication, and other qualitative and quantitative differences. Lack of comprehensive performance dimension coverage is like comparing runtimes of runners without taking into account differences in terrain, length of race, altitude, temperature, etc.

It is very easy to critique. It is harder to critique in a way that sheds light.

selcuka 454 days ago

> As opposed to humans who all derive the physics of heat transfer independently when given a question like this?

Isn't that the difference between learning and memorizing, though? If you were taught Newton's Law of Cooling using this example and truly learned it, you could apply it to other problems as well. But if you only memorized it, you might be able to recite it when asked the same question, yet still be unable to apply it to anything else.

accrual 455 days ago

> It is very easy to critique. It is harder to critique in a way that sheds light.

Well said. This is the sort of ethos I admire and aspire to on HN.

mhh__ 455 days ago

So is my knowledge of Newton's law of cooling.

kazinator 455 days ago

If an LLM has only that knowledge and nothing else (pieces of text saying that heat transfer is proportional to some function of the temperature difference), such that it is not trained on any texts that give problems and solutions in this area, it will not work this out, since it has nothing to generate tokens from.

Also, your knowledge doesn't come from anywhere near having scanned terabytes of text, which would take you multiple lifetimes of full-time work.

mhh__ 455 days ago

We get way more info than LLMs do, just not solely from text.

suddenlybananas 455 days ago

You have not read every accessible piece of text in existence.

mhh__ 455 days ago

There is more to life than just text, e.g. this is part of LeCun's argument against LLMs.

fph 455 days ago

This exact problem was in Martin Gardner's column for Scientific American in the 1970s. There are surely references all over the internet.

jonplackett 455 days ago

If it was just ‘in the training data’ they’d all get it right.

But they don’t.

kazinator 455 days ago

I don't think that can be postulated as a law, because they are a kind of lossy compression. Different lossy compressions will lose different details.
