HN Mirror



	by rsweeney21 919 days ago
	It's still strange to me to work in a field of computer science where we say things like "we're not exactly sure how these numbers (hyper parameters) affect the result, so just try a bunch of different values and see which one works best."

14 comments

TacticalCoder 919 days ago

> "we're not exactly sure how these numbers (hyper parameters) affect the result, so just try a bunch of different values and see which one works best."

Isn't it the same for anything that uses a Monte Carlo simulation to find a value? At times you'll end up on a local maximum (instead of the best/correct answer), but it works.

We cannot solve something using a closed-form formula, so we just do a billion (or whatever) random samplings and find what we're after.

I'm not saying it's the same for LLMs but "trying a bunch of different values and see which one works best" is something we do a lot.
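
As a rough illustration of "do a lot of random samplings and keep the best one", here is a minimal random-search sketch. The `objective` function, the search interval, and the sample counts are made up for the example and are not taken from any real model; the point is only that more samples tend to land closer to the best peak, while a small budget can stop on a lesser bump.

```python
import math
import random


def objective(x: float) -> float:
    # Hypothetical bumpy function with several local maxima on [0, 10].
    return math.sin(3 * x) + 0.5 * math.sin(x) - 0.05 * (x - 5) ** 2


def random_search(n_samples: int, lo: float = 0.0, hi: float = 10.0, seed: int = 0):
    """Sample x uniformly n_samples times and keep the best value seen."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best_x = rng.uniform(lo, hi)
    best_y = objective(best_x)
    for _ in range(n_samples - 1):
        x = rng.uniform(lo, hi)
        y = objective(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        x, y = random_search(n)
        print(f"{n:>7} samples: best x = {x:.4f}, f(x) = {y:.4f}")
```

Because the seed is fixed, the smaller runs see a prefix of the samples the larger runs see, so the best value found can only stay the same or improve as the budget grows. Nothing here guarantees the global maximum; it just makes missing it less likely.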

r3trohack3r 919 days ago

I feel like it's the difference between something that has been engineered and something that has been discovered.

I feel like most of our industry up until now has been engineered.

LLMs were discovered.

herval 918 days ago

LLMs were very much engineered... the exact results they yield are hard to determine since they're large statistical models, but I don't think that categorizes the LLMs themselves as a 'discovery' (like, say, penicillin)

baq 918 days ago

There’s an argument that all maths are discovered instead of invented or engineered. LLM hardware certainly is hard engineering but the numbers you put in it aren’t, once you have them; if you stumbled upon them by chance or they were revealed to you in your sleep it’d work just as well. (‘ollama run mixtral’ is good enough for a dream to me!)

SkyMarshal 919 days ago

If the Black Swan model of science is true, then most of the consequential innovations and advances are discovered rather than engineered.

arketyp 919 days ago

I understand your distinction, I think, but I would say it is more engineering than ever. It's like the early days of the steam engine or firearms development. It's not a hard science, not formal analysis, it's engineering: tinkering, testing, experimenting, iterating.

peddling-brink 919 days ago

> tinkering, testing, experimenting, iterating

But that describes science. http://imgur.com/1h3K2TT/

amelius 919 days ago

AI requires a lot of engineering. However, the engineering is not what makes working in AI interesting. It's the plumbing, basically.

mejutoco 918 days ago

I believe, from what I saw in mathematics, this is a matter of taste. Discovered and invented are two perspectives. Some people prefer to think that light is reaching into previously dark corners of knowledge that were waiting to be discovered (discovery). Others prefer to think that by force of genius they brought the thing into the world (invention).

To me, personally, these are two sides of the same coin, with neither having more proof than the other.

justanotheratom 919 days ago

and finally, this justifies the "science" in Computer Science.

SkyMarshal 919 days ago

That bottom-up tinkering is kinda how CS started in the US, as observed by Dijkstra himself: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD06xx/E...

Ideally we want theoretical foundations, but sometimes random explorations are necessary to tease out enough data to construct or validate theory.

CamperBob2 919 days ago

This can be laid at the feet of Minsky and others who dismissed perceptrons because they couldn't model nonlinear functions. LLMs were never going to happen until modern CPUs and GPUs came along, but that doesn't mean we couldn't have a better theoretical foundation in place. We are years behind where we should be.

When I worked in the games industry in the 1990s, it was "common knowledge" that neural nets were a dead end at best and a con job at worst. Really a shame to lose so much time because a few senior authority figures warned everyone off. We need to make sure that doesn't happen this time.
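
For anyone who hasn't seen the limitation being referred to: the textbook example is that a single-layer perceptron can learn AND but not XOR, because XOR is not linearly separable. Below is a minimal sketch using the classic perceptron update rule. The function names, the epoch cap, and the integer learning rate of 1 are my own choices for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[int, int], int]

AND_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA: List[Sample] = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def predict(w: Sequence[int], b: int, x: Sequence[int]) -> int:
    # Single linear threshold unit: fires when w . x + b is positive.
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def train_perceptron(data: List[Sample], epochs: int = 100):
    """Classic perceptron rule; stops early once an epoch has no mistakes."""
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        mistakes = 0
        for x, target in data:
            error = target - predict(w, b, x)
            if error != 0:
                w[0] += error * x[0]
                w[1] += error * x[1]
                b += error
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def accuracy(w: Sequence[int], b: int, data: List[Sample]) -> float:
    return sum(predict(w, b, x) == t for x, t in data) / len(data)


if __name__ == "__main__":
    for name, data in (("AND", AND_DATA), ("XOR", XOR_DATA)):
        w, b = train_perceptron(data)
        print(name, "weights:", w, "bias:", b, "accuracy:", accuracy(w, b, data))
```

AND converges to a perfect fit, while XOR never gets all four points right no matter how many epochs you allow, since no single line separates its classes. Adding a hidden layer removes that restriction.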

spidersenses 919 days ago

What is the point you're trying to make?

CamperBob2 919 days ago

> What is the point you're trying to make?

Answering the GP's point regarding why deep learning textbooks, articles, and blog posts are full of sentences that begin with "We think..." and "We're not sure, but..." and "It appears that..."

What's yours?

UberFly 919 days ago

This is what researching different Stable Diffusion settings is like. You quickly learn that there's a lot of guessing going on.

fierro 919 days ago

we have no theories of intelligence. We're like people in the 1500s trying to figure out why and how people get sick, with no concept of bacteria, germs, transmission, etc

thatguysaguy 919 days ago

I haven't seen this key/buzzword mentioned yet, so I think part of it is the fact that we're now working on complex systems. This was already true (a social network is a complex system), but now we have the impenetrability of a complex system within the scope of a single process. It's hard to figure out generalizable principles about this kind of thing!

manojlds 919 days ago

Divine benevolence

FuckButtons 918 days ago

I mean, it’s kind of in the name isn’t it? Computer science. Science is empirical, often poorly understood and even the best theories don’t fully explain all observations, especially when a field gets new tools to observe phenomena. It takes a while for a good theory to come along and make sense of everything in science and that seems like more or less exactly where we are today.

jncfhnb 918 days ago

Not strange at all. This is largely how biology operates. These things are simpler than bio and more complex than programs

amelius 919 days ago

AI is more like gardening than engineering. You try things without knowing the outcome. And you wait a very long time to see the outcome.

raxxorraxor 918 days ago

Welcome to engineering. We don't sketch our controlled systems and forget all about systems theory. Instead we just fiddle with our controllers until the result is acceptable.

stormfather 919 days ago

It's how God programs

jejeyyy77 919 days ago

it's a new paradigm