Hacker News
by duluca 550 days ago
The first computers cost millions of dollars and filled entire rooms to accomplish what we would now consider simple computational tasks. That same computing power now fits into the width of a fingernail. I don’t get how technologists balk at the cost of experimental tech, or assume current tech will run at the same efficiency for decades to come and melt the planet into a puddle. AGI won’t happen until you can fit several data centers’ worth of compute into a brain-sized vessel, so the thing can move around and process the world in real time. This is all going to take some time, to say the least. Progress is progress.
8 comments

I thought you were going to say that now we're back to bigger-than-room sized computers that cost many millions just to perform the same tasks we could 40 years ago.

I of course mean we're using these LLMs for a lot of tasks that they're inappropriate for, and a clever manually coded algorithm could do better and much more efficiently.

> and a clever manually coded algorithm could do better and much more efficiently.

Sure, but how long would it take to implement this algorithm, and would that be worth it for one-off cases?

Just today I asked Claude to create a jq query that looks for objects with a certain value for one field, but which lack a certain other field. I could have spent a long time trying to make sense of jq's man page, but instead I spent 30 seconds writing a short description of what I'm looking for in natural language, and the AI returned the correct jq invocation within seconds.
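
For readers who want to see the shape of that kind of filter, here is a minimal Python sketch. The field names (`status`, `email`) and the sample records are hypothetical, since the comment does not say which fields were involved, and this is not the invocation Claude actually returned. In jq itself, a filter of roughly this shape could be written as `.[] | select(.status == "active" and (has("email") | not))`, again with assumed field names.

```python
import json

def matching_objects(records, field, value, missing_field):
    """Return objects where `field` equals `value` and `missing_field` is absent."""
    return [
        obj for obj in records
        if isinstance(obj, dict)
        and field in obj
        and obj[field] == value
        and missing_field not in obj
    ]

# Hypothetical sample data, for illustration only.
raw = '''[
  {"id": 1, "status": "active", "email": "a@example.com"},
  {"id": 2, "status": "active"},
  {"id": 3, "status": "inactive"}
]'''

records = json.loads(raw)
result = matching_objects(records, "status", "active", "email")
print(result)
```

Only the second record survives the filter here: it has the wanted value and lacks the other field.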

I don’t think this is a bad use. A bad use would be to give Claude the dataset and ask it to tell you which elements have that value.
Claude answers a lot of its questions by first writing and then running code to generate the results. Its only limitation is the access to databases and size of context window, both of which will be radically improved over the next 5 years.
I would still rather be able to see the code it generates
Ha, I tried that before. However, the file was too large for its context window, so it only seemed to analyze the first part and gave a wrong result.
It was your own data, right? Because you just donated half of it...
It's okay, I also uploaded an NDA in a previous prompt :-)
But how do you know it's given you the correct answer? Just because the code appears to work it doesn't mean it's correct.
But how do I know if my hand-written jq query is the correct solution? Just because the query appears to work it doesn't mean it's correct.
Because I understand the process that I have followed to get to the solution.
It can explain its solution. Point to relevant docs as well.
It can also very convincingly explain a non-solution pointing to either real or hallucinated docs.
You need to look at the docs.
Omg, this is how LLMs used to trick me, inventing all these APIs.
Look at the docs it links to.
just ask the LLM to solve enough problems (even new problems), cache the best, do inference time compute for the rest, figure out the best/ fastest implementations, and boom, you have new training data for future AIs
> cache the best

How do you quantify that?

"Assume the role of an expert in cache invalidation..."
"one does not just assume", "because the hardest problems in Tech are Johnny Cash invalidations" --Lao Tzi
> "Those who invalidate caches know nothing; Those who know retain data." These words, as I am told, were spoken by Lao Tzi. If we are to believe that Lao Tzi was himself one who knew, why did he erase /var/tmp to make space for his project?

-- Poem by Cybernetic Bai Juyi, "The Philosopher [of Caching]"

“Assume the role of an expert in naming things. You know, a… what do they call those people again… there must be a name for it”
however you want
The LLMs are now writing their own algorithms to answer questions. Not long before they can design a more efficient algorithm to complete any feasible computational task, in a millionth of the time needed by the best human.
> The LLMs are now writing their own algorithms to answer questions

Writing a Python script, because it can't do math or any form of more complex reasoning, is not what I would call "own algorithm". It's at most application of existing ones/calling APIs.

LLMs are probabilistic string blenders pulling pieces up from their training set, which unfortunately comes from us, humans.

The superset of the LLM knowledge pool is human knowledge. They can't go beyond the boundaries of their training set.

I'll not go into how humans have other processes which can alter their own and collective human knowledge, but the rabbit hole starts with "emotions, opposable thumbs, language, communication and other senses".

> They can't go beyond the boundaries of their training set.

TFA says they just did. That's what the ARC-AGI benchmark was supposed to test.

> take several data center’s worth of compute into a brain sized vessel. So the thing can move around process the world in real time

How so? I'd imagine a robot connected to the data center embodying its mind, connected via low-latency links, would have to walk pretty far to get into trouble when it comes to interacting with the environment.

The speed of light is about three orders of magnitude faster than the speed of signal propagation in biological neurons, after all.

The robot brain could be layered so that more basic functions are embedded locally while higher-level reasoning is offloaded to the cloud.
blue strip from iRobot?
6 orders of magnitude if we use 120 m/s vs 300 km/s
should've said "6 orders of magnitude if we use 120 m/s vs 300_000 km/s" - I was also off by 3 orders of magnitude!
Ah, yes, I missed a “k” in that estimation!
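
The corrected arithmetic can be checked with a short Python sketch. The 120 m/s and 300,000 km/s figures are the ones used in the thread; the 1 m neural path length is an assumption picked only for illustration, and the calculation ignores processing delays and the fact that signals in fibre travel slower than light in vacuum.

```python
import math

NEURON_SPEED_M_S = 120.0   # neural signal speed used in the thread
LIGHT_SPEED_M_S = 3.0e8    # 300,000 km/s, rounded

# How many times faster is light, and how many orders of magnitude is that?
ratio = LIGHT_SPEED_M_S / NEURON_SPEED_M_S
orders = math.log10(ratio)

# Assumed 1 m neural path: how far does light get in the same time?
neural_path_m = 1.0
neural_delay_s = neural_path_m / NEURON_SPEED_M_S
light_distance_km = LIGHT_SPEED_M_S * neural_delay_s / 1000.0

print(f"ratio: {ratio:.3g}, orders of magnitude: {orders:.1f}")
print(f"light covers about {light_distance_km:.0f} km in {neural_delay_s * 1000:.1f} ms")
```

The ratio comes to about 2.5 million, a little over 6 orders of magnitude. Under these idealised assumptions, light covers roughly 2,500 km in the time a neural signal travels 1 m.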
Many of humans' capabilities are pretrained with massive computing through evolution. Inference results of o3 and its successors might be used to train the next generation of small models to be highly capable. Recent advances in the capabilities of small models such as Gemini-2.0 Flash suggest the same.

Recent research from NVIDIA suggests such an efficiency gain is quite possible in the physical realm as well. They trained a tiny model to control the full body of a robot via simulations.

---

"We trained a 1.5M-parameter neural network to control the body of a humanoid robot. It takes a lot of subconscious processing for us humans to walk, maintain balance, and maneuver our arms and legs into desired positions. We capture this “subconsciousness” in HOVER, a single model that learns how to coordinate the motors of a humanoid robot to support locomotion and manipulation."

...

"HOVER supports any humanoid that can be simulated in Isaac. Bring your own robot, and watch it come to life!"

More here: https://x.com/DrJimFan/status/1851643431803830551

---

This demonstrates that with proper training, small models can perform at a high level in both cognitive and physical domains.

> Similarly, many of humans' capabilities are pretrained with massive computing through evolution.

Hmm .. my intuition is that humans' capabilities are gained during early childhood (walking, running, speaking .. etc) ... what are examples of capabilities pretrained by evolution, and how does this work?

If you look at animals, they can walk in hours, not much time needed after being born. It takes us a longer time because we are born rather undeveloped to get the head out of the birth canal.

A more high-level example: sea sickness is an evolutionary pre-learned thing. Your body thinks it's poisoned and automatically wants to empty your stomach.

The brain is predisposed to learn those skills. Early childhood experiences are necessary to complete the training. Perhaps that could be likened to post-training. It's not a one-to-one comparison but a rather loose analogy, which I didn't make precise because it is not the key point of the argument.

Maybe evolution could be better thought of as neural architecture search combined with some pretraining. Evidence suggests we are prebuilt with "core knowledge" by the time we're born [1].

See: Summary of cool research gained from clever & benign experiments with babies here:

[1] Core knowledge. Elizabeth S. Spelke and Katherine D. Kinzler. https://www.harvardlds.org/wp-content/uploads/2017/01/Spelke...

> The brain is predisposed to learn those skills.

Learning to walk doesn't seem to be particularly easy, having observed the process with my own children. No easier than riding a bike or skating, for which our brains are probably not 'predisposed'.

Walking is indeed a complex skill. Yet some animals walk minutes after birth. Human babies are most likely born premature due to the large brain and related physical constraints.

Young children learn to bike or skate at an older age after they have acquired basic physical skills.

Check out the reference to Core Knowledge above. There are things young infants know or are predisposed to know from birth.

The brain has developed, through evolution, very specific and organized structures that allow us to learn language and reading skills. If you have a genetic defect that causes those structures to be faulty or missing, you will have severe developmental problems.

That seems like a decent example of pretraining through evolution.

But maybe it's something more like general symbolic manipulation, and not specifically the sounds or structure of language. Reading is fairly new and unlikely to have had much if any evolutionary pressure in many populations who are now quite literate. Same seems true for music. Maybe the hardware is actually more general and adaptable and not just for language?
> No easier than riding a bike or skating, for which our brains are probably not 'predisposed'.

What makes you think so? Humans came up with biking and skating, because they were easy enough for us to master with the hardware we had.

I think of evolution as unassisted learning where agents compete with each other for limited resources. Over time they get better and better at surviving by passing on genes. It never ends of course.
Your brain is well adapted to learning how to walk and speak.

Chimpanzees score pretty high on many tests of intelligence, especially short term working memory. But they can't really learn language: they lack the specialised hardware more than the general intelligence.

I mean, there are plenty - e.g. mimicking (say, the mother's face's emotions), which are precursors to learning more advanced "features". Also, even walking has many aspects pretrained (I assume it's mostly a musculoskeletal limitation that we can't walk immediately), humans are just born "prematurely" due to our relatively huge heads. Newborn horses can walk immediately without learning.

But there are plenty of non-learned control/movement/sensing in utero that are "pretrained".

Interestingly, there's a bunch of reflexes that also only develop over time.

They are more nature than nurture, but they aren't 'in-born'.

Just like humans aren't (usually) born with teeth, but they don't 'learn' to have teeth or pubic hair, either.

The concern here is mainly on practicality. The original mainframes did not command startup valuations counted in fractions of the US economy, they did qualify for billions in investment.

This is a great milestone, but OpenAI will not be successful charging 10x the cost of a human to perform a task.

The cost of inference has dropped by ~100x in the past 2 years.

https://a16z.com/llmflation-llm-inference-cost/

Hmm the link is saying the price of an LLM that scores 42 or above on MMLU has dropped 100x in 2 years, equating gpt 3.5 and llama 3.2 3B. In my opinion gpt 3.5 was significantly better than llama 3B, and certainly much better than the also-equated llama 2 7B. MMLU isn't a great marker of overall model capabilities.

Obviously the drop in cost for capability in the last 2 years is big, but I'd wager it's closer to 10x than 100x.
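
To put the two estimates side by side, here is a small Python sketch that converts a total price drop over a period into an equivalent per-year factor. It assumes a constant rate of decline over the two years, which is a simplification that neither the linked article nor the comments claim.

```python
def annual_factor(total_drop, years):
    """Per-year drop factor equivalent to `total_drop` over `years`,
    assuming the decline compounds at a constant rate."""
    return total_drop ** (1.0 / years)

claimed = annual_factor(100, 2)   # the ~100x-in-2-years figure
wagered = annual_factor(10, 2)    # the "closer to 10x" estimate

print(f"100x over 2 years is about {claimed:.2f}x per year")
print(f"10x over 2 years is about {wagered:.2f}x per year")
```

Under that assumption, the gap between the two views is roughly 10x per year versus a bit over 3x per year.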

*infernonce
*inference
> OpenAI will not be successful charging 10x the cost of a human to perform a task.

True, but they might be successful charging 20x for 2x the skill of a human.

Or 10x the skill and speed of a human in some specific class of recurrent tasks. We don't need full super-human AGI for AI to become economically viable.
Companies routinely pay short-term contractors a lot more than their permanent staff.

If you can just unleash AI on any of your problems, without having to commit to anything long term, it might still be useful, even if they charged more than for equivalent human labour.

(Though I suspect AI labour will generally trend to be cheaper than humans over time for anything AIs can do at all.)

I wouldn’t expect it to cost 10x in five years, if only because parallel computing still seems to be roughly obeying Moore’s law.
How much does AWS charge for compute?

If it can be spun up with Terraform, I bet you they could.

Maybe AGI as a goal is overvalued: If you have a machine that can, on average, perform symbolic reasoning better than humans, and at a lower cost, that's basically the end game, isn't it? You won capitalism.
Right now I can ask an (experienced) human to do something for me and they will either just get it done or tell me that they can’t do it.

Right now when I ask an LLM… I have to sit there and verify everything. It may have done some helpful reasoning for me but the whole point of me asking someone else (or something else) was to do nothing at all…

I’m not sure you can reliably fulfill the first scenario without achieving AGI. Maybe you can, but we are not at that point yet so we don’t know yet.

You do need to verify humans' work though.

The difference, to me, is that humans seem to be good at canceling each other's mistakes when put in a proper environment.

Not with the same depth. I might ask a friend to drop off a letter and I might verify that they did it, but I don’t have to verify that they didn’t mistake a Taco Bell or a dumpster as the post office.

It’s very scary to ask a friend to drop off a letter if the last scenario is even 1% within the realm of possibility.

My guess is this is an artifact of the RLHF part of the training. Answers like "I don't know" or "let me think and let's catch up on this next week" are flagged down by human testers, which eventually trains the LLM to avoid this path altogether. And it probably makes sense, because otherwise "I don't know" would come up way too often even in cases where the LLM is perfectly able to give the answer.
I don't know, that seems like a fundamental limitation. LLMs don't have any ability to do reflection on their own knowledge/abilities.
Humans aren't very aware of their limits, either.

Even the Dunning-Kruger effect is, ironically, widely misunderstood by people who are unreasonably confident about their knowledge.

But you know if you have ever heard about call by name or value semantics.
Yes, Dunning-Kruger's paper never found what popular science calls the 'Dunning-Kruger' effect.

Effectively, they found nothing real but a statistical artifact.

> Right now I can ask an (experienced) human to do something for me and they will either just get it done or tell me that they can’t do it.

Finding reliable honest humans is a problem governments have struggled with for over a hundred years. If you have cracked this problem at scale you really need to write it up! There are a lot of people who would be extremely interested in a solution here.

> Finding reliable honest humans is a problem governments have struggled with for over a hundred years.

Yes, though you are downplaying the problem a lot. It's not just governments, and it's way longer than 100 years.

Btw, a solution that might work for you or me, presumably relatively obscure people, might not work for anyone famous, nor a company nor a government.

It's not clear to me whether AGI is necessary for solving most of the issues in the current generation of LLMs. It is possible you can get there by hacking together CoTs with automated theorem provers and bruteforcing your way to the solution or something like that.

But if it's not enough then maybe it might come as a second-order effect (e.g. reasoning machines having to bootstrap an AGI so then you can have a Waymo taxi driver who is also a Fields medalist)

There are so-called "yes-men" who can't say "no" in any situation. That's rooted in their culture. I suspect that AI was trained using their assistance. I mean, answering "I can't do that" is the simplest LLM path that should work often, unless they've gone out of their way to downrank it.
Honestly, it doesn't need to be local. An API some 200 ms away is ok-ish; make it 50 ms and it will be practically usable for the vast majority of interactions.
Batteries..
Intelligence has nothing at all whatever to do with compute.
Unless you're a dualist who believes in a magic spirit, I cannot understand how you think that's the case. Can you please explain?
Philosophy of mind is the branch of philosophy that attempts to account for a very difficult problem: why there are apparently two different realms of phenomena, physical and mental, that are at once tightly connected and yet as different from one another as two things can possibly be.

Broadly speaking you can think that the mental reduces to the physical (physicalism), that the physical reduces to the mental (idealism), both reduce to some other third thing (neutral monism) or that neither reduces to the other (dualism). There are many arguments for dualism but I’ve never heard a philosopher appeal to “magic spirits” in order to do so.

Here’s an overview: https://plato.stanford.edu/entries/dualism/

Dualism has nothing to do with it. There are more things in heaven and earth than just computable functions in the mathematical sense.

(In fact, the very idea of "computable functions" was invented to narrow down the space of "all things" to something much smaller, tighter and manageable. And now we've come full circle and apparently everything in the universe is a computable function? Well, if all you have is a hammer, I guess everything must necessarily look like a nail.)

Intelligence is about learning from few examples and generalising to novel solutions. Increasing compute so that exploring the whole problem space is possible is not intelligence. There is a reason the actual ARC-AGI prize has efficiency as one of the success requirements. It is not so that the solutions scale to production and whatnot; these are toy tasks. It is to help ensure that it is actually an intelligent system solving these.

So yeah, the o3 result is impressive, but if the difference between o3 and the previous state of the art is more compute to do a much longer CoT/evaluation loop, I am not so impressed. Reminder that these problems are solved by humans in seconds; ARC-AGI is supposed to be easy.

Do you think intelligence exists without prior experience? For instance, can someone instantly acquire a skill—like playing the piano—as if downloading it in The Matrix? Even prodigies like Mozart had prior exposure. His father, a composer and music teacher, introduced him to music from an early age. Does true intelligence require a foundation of prior knowledge?
Intelligence requires the ability to separate the wheat from the chaff on one's own to create a foundation of knowledge to build on.

It is also entirely possible to learn a skill without prior experience. That's how it (whatever skill) was first done.

> Does true intelligence require a foundation of prior knowledge?

This is the way I think about it.

I = E / K

where I is the intelligence of the system, E is the effectiveness of the system, and K is the prior knowledge.

For example, a math problem is given to two students, each solving the problem with the same effectiveness (both get the correct answer in the same amount of time). However, student A happens to have more prior knowledge of math than student B. In this case, the intelligence of B is greater than the intelligence of A, even though they have the same effectiveness. B was able to "figure out" the math, without using any of the "tricks" that A already knew.

Now back to your question of whether or not prior knowledge is required. As K approaches 0, intelligence approaches infinity. But when K=0, intelligence is undefined. Tada! I think that answers your question.

Most LLM benchmarks simply measure effectiveness, not intelligence. I conceptualize LLMs as a person with a photographic memory and a low IQ of 85, who was given 100 billion years to learn everything humans have ever created.

IK = E

low intelligence * vast knowledge = reasonable effectiveness
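
The I = E / K relation above, including the undefined case at K = 0, can be sketched in a few lines of Python. The numeric values for the two students are made up purely for illustration; the comment gives no units or scales for E or K.

```python
def intelligence(effectiveness, prior_knowledge):
    """I = E / K as proposed in the comment; undefined when K is 0."""
    if prior_knowledge == 0:
        raise ValueError("intelligence is undefined when prior knowledge K is 0")
    return effectiveness / prior_knowledge

# Two students with the same effectiveness but different prior knowledge.
# The numbers are hypothetical.
E = 1.0
i_a = intelligence(E, prior_knowledge=4.0)  # student A knows more math
i_b = intelligence(E, prior_knowledge=2.0)  # student B knows less

print(f"I_A = {i_a}, I_B = {i_b}")

# Rearranged form from the comment: I * K = E
assert i_a * 4.0 == E and i_b * 2.0 == E
```

With equal effectiveness, B comes out with the higher value because B started from less prior knowledge, which matches the two-student example in the comment.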

Thank you for laying out your thoughts. This is quite a well-detailed argument.

In your calculations, in relation to humans, how do you view the 500k - 700k years of learned behaviors and acquired responses passed to offspring?

Reducing the broad category of "experience" to "computable functions in the mathematical sense" is quite, hm, reductive.