| HN Mirror


by godelski 540 days ago

I have a simple answer for you: most people write garbage code[00]. I know this because I write garbage code but usually less garbage.

A little background...

I got my undergrad in physics (where I fell in love with math), spent some years working, and got really interested in coding and especially ML (especially the math). So I went to grad school. Unsurprisingly, I had major impostor syndrome being surrounded by CS people[0], so I spent a huge amount of time trying to fill in the gap. The real problem was that many of my friends were PL people, so they're math heavy and great programmers. But after teaching a bunch of high-level CS classes I realized I wasn't behind. After having to fix a lot of autograders made by my peers, I didn't feel so behind. When my lab grew and I got to meet a lot more ML people, I felt ahead, and confused. I realized the problem: I was trying to be a physicist in CS. I was trying to understand things at a very fundamental level and build up from there, not feeling like I "knew" a topic until I knew that whole chain. I realized other people were just saying they knew things at a different threshold.

Back to ML:

Working and researching in ML I've noticed one common flaw. People are ignoring details. I thought HuggingFace would be "my savior" because people would see that their generation outputs weren't nearly the quality you see in papers. But this didn't happen. We cherry-picked results and ignored failures. It feels like writing proofs where people only look at the last line (I'd argue this is analogous to code! It's about so much more than the output! The steps are the thing that matters).

So there are two camps of ML people now: the hype people and "the skeptics" (interestingly there's a large population of people with physics and math backgrounds here). I put the latter in quotes because we're not trying to stop ML progress. I'd argue we're trying to make it! The argument is that we need to recognize flaws so we know what needs to be fixed. This is why Francois Chollet made the claim that GPT has delayed progress towards AGI. Because we are doing the same thing that caused the last AI winter: putting all our eggs in one basket. We've made it hard to pursue other ideas and models because to get published you need to beat benchmarks (good luck doing so out of the gate and without thousands of GPUs). Because we don't look at the limitations in benchmarks. Because we don't even check for God damn information spoilage (test data leaking into training data) anymore. Even HumanEval is littered with spoilage, and obviously so...
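For concreteness, a spoilage check can be as crude as measuring how many of a benchmark item's word n-grams also show up in the training text. The sketch below is only an illustration under that assumption: the function names `ngrams` and `overlap_fraction` and the choice of `n` are hypothetical, and it is not how HumanEval or any particular benchmark is actually audited.

```python
def ngrams(text, n):
    """Return the set of lowercase, whitespace-tokenized word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_fraction(benchmark_item, corpus_docs, n=8):
    """Fraction of the item's n-grams that also appear in any corpus document.

    Hypothetical helper: 1.0 means every n-gram of the benchmark item was
    found in the corpus, 0.0 means none were (or the item is shorter than n).
    """
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0
    corpus_grams = set()
    for doc in corpus_docs:
        corpus_grams |= ngrams(doc, n)
    return len(item_grams & corpus_grams) / len(item_grams)
```

A high fraction doesn't prove the model memorized the item, and a low one doesn't prove it is clean (paraphrases slip right through), but it is the kind of detail that is cheap to look at and usually isn't.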

There are tons of uses for LLMs and ML systems. My "rage" (as with many others) is more about overpromising. Because we know that if you don't fulfill those promises quickly, sentiment turns against you and funding quickly goes away. Just look at how even HN went from extremely positive on AI to a similar dichotomy (though the "skeptics" here are probably more skeptical than researchers[1]). It's playing with fire. Prometheus gave it to man to enlighten themselves but they also burned themselves quite frequently.

The answer is:

you evaluate in more detail than others.

[00] of course it is. LLMs replicate average human code. They're optimizers. They optimize for fitting the data, not for optimal symbolic manipulation. If everyone were far better at code, LLMs would be too. That's how they work.

[0] boy, us physicists have big egos but CS people give us a run for our money

[1] I have no doubt that AGI can be created. I have no doubt we humans can make it. But I highly doubt LLMs will get us there and we need to look in other directions. I'm not saying we should stop pursuing LLMs, I'm saying don't stop the other research from happening. It's not a zero-sum game. Most things in the real world are not (but for some god damn reason we always think they are).