Hacker News
by bodyfour 1177 days ago
> If you don't consider compute cost [...]

Yes, but what if you do? Imagine your hyper-specialized, API-heavy model takes 10x fewer resources to answer a question (or at least a question relevant to the task at hand). Won't it be more powerful to have a model that can run 10 times as fast (or run 10 instances in parallel)?

What if the ratio turns out to be 100x or 1000x?

So I agree that the cutting edge of "best possible AGI" might mean building the largest models we can train on massive clusters of computers and then run on high-end hardware. My hunch, though, is that models that can be run on cheap hardware and then "swarmed" on a problem space will be even more powerful in what they can perform in aggregate.

Again, it's just my hunch, but right now I think everybody's predictions are hunches.

I'll actually go one bit further: even for a linear problem-solving task that can't be "swarmed" in the same way, cheaper-per-token models might still do better. Existing models already have the ability to use randomness to give more "creative", if less reliable, answers. This is inherently parallelizable though -- in fact Bard seems to be exposing this in its UI in the form of multiple "drafts". So what if you just ran 100 copies of your cheap-AI against a problem and then had one cheap-AI (or maybe a medium-AI) judge the results?
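
A rough sketch of the "100 copies plus a judge" idea, in Python. Everything in it is a made-up stand-in rather than a real API: `cheap_model` fakes an unreliable sampler that is right only 40% of the time, and `judge` is a plain majority vote swapped in for the cheap-AI (or medium-AI) judge, which would be a model call in practice.

```python
import random
from collections import Counter


def cheap_model(question: str, rng: random.Random) -> int:
    """Hypothetical stand-in for one sampled draft from a cheap model.

    Assumption for illustration: the right answer is 42, and a single
    draft gets it only 40% of the time. Wrong drafts are scattered
    across several different values.
    """
    if rng.random() < 0.4:
        return 42
    return rng.choice([7, 13, 21, 99, 100, 101])


def judge(drafts: list[int]) -> int:
    """Toy judge: majority vote over the drafts.

    A model-based judge would read and rank the drafts instead; voting
    is just the simplest thing that shows the effect.
    """
    return Counter(drafts).most_common(1)[0][0]


def swarm_answer(question: str, n: int = 100, seed: int = 0) -> int:
    """Sample n independent drafts, then let the judge pick one."""
    rng = random.Random(seed)
    drafts = [cheap_model(question, rng) for _ in range(n)]
    return judge(drafts)


if __name__ == "__main__":
    print(swarm_answer("what is the answer?", n=100))
```

The point of the toy numbers: any single draft is wrong more often than not, but the wrong drafts disagree with each other while the right ones agree, so the aggregate is far more reliable than one sample. That only holds if the errors really are spread out; if the cheap model makes the same mistake every time, voting won't save it.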

Or at the risk of getting too anthropomorphic about it: imagine you as a human are writing a program and you get stuck on a tricky bit -- you know that the problem should be solvable but you've never done anything similar and don't know what algorithm to start with. Suppose then you could tell your brain "Temporarily fork off 100 copies of yourself. 10 of them go do a literature review of every CS paper you can find related to this topic. 10 of you search for open source programs that might have a similar need and try to determine how their code does it. The other 80 of you just stare off into the middle distance and try to think of a creative solution. In two human-seconds write a summary of your best idea and exit. I'll then read them all and see if I/we are closer to understanding what to do next."
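
For what it's worth, the fork-and-collect shape of that thought experiment is easy to write down. A minimal sketch using Python's standard `concurrent.futures`, where the role names, the `worker` function, and its canned summary string are all hypothetical placeholders for real model calls:

```python
from concurrent.futures import ThreadPoolExecutor, wait

# The 10 / 10 / 80 split from the thought experiment above.
ROLES = {"literature_review": 10, "code_search": 10, "brainstorm": 80}


def worker(role: str, worker_id: int) -> str:
    """Hypothetical copy: would call a model; here it returns a canned summary."""
    return f"{role}#{worker_id}: summary of best idea"


def fork_and_collect(timeout_s: float = 2.0) -> list[str]:
    """Fork one worker per copy, wait up to timeout_s, gather the summaries."""
    with ThreadPoolExecutor(max_workers=sum(ROLES.values())) as pool:
        futures = [
            pool.submit(worker, role, i)
            for role, count in ROLES.items()
            for i in range(count)
        ]
        done, not_done = wait(futures, timeout=timeout_s)
        # cancel() only stops tasks that have not started yet; a worker
        # that is already running still finishes before the pool exits.
        for future in not_done:
            future.cancel()
        # `done` is a set, so the summaries come back in no fixed order.
        return [future.result() for future in done]


if __name__ == "__main__":
    summaries = fork_and_collect()
    print(len(summaries), "summaries to read")
```

The final "read them all" step is left to the caller, which is where a judge model would slot in.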

For us, this type of mental process is so alien we can't even imagine what it would feel like to be able to do it. It might come completely naturally to an AI, though.