by behnamoh 843 days ago
I guess the argument is that most AI research is supported by big tech, which has invested heavily in the deep learning approach.

If that funding were funneled to research groups working on alternative approaches, maybe we'd see the same amount of progress in AI, just with a different approach.


As a member of the research community: that's nonsense. As already pointed out, academic groups (who are by no means dependent on big tech) would jump all over that. Mamba has been out long enough that you'd already see tons of papers on arXiv showing Mamba dominating transformers in all sorts of applications. But that's not happening, despite the ton of hype. That doesn't mean that Mamba is nonsense. Just that it isn't the immediate transformer killer. It remains to be seen if something comes from it, eventually.
As a member of the research community: that's nonsense. Publishing is an extremely noisy process in ML and is getting increasingly difficult for smaller labs that don't collaborate with big tech. Reviewers' go-to objections are: more datasets, more scale, not novel. The easiest way to approach this is to work off of pretrained models. This is probably more obvious in the NLP world.

I agree that Mamba doesn't solve everything and it still needs work. But I disagree with the logic that there isn't an issue of railroading.

What’s the main difference between an ape’s brain and a human brain? Scale. So that’s the train we’re riding at the moment. No roadblocks yet, aside from cost.
> What’s the main difference between an ape’s brain and a human brain? Scale.

This is incredibly naive, with absolutely no scientific basis. There is no evidence that the difference comes down to scale of data or scale of architecture.

There are a number of animals with larger brains in terms of both mass and total number of neurons. An African Elephant has roughly 3x the number of neurons humans have. Dolphins beat humans in total surface area. Neanderthals are estimated to have had larger brains too! It isn't mass, neuron count, neuron density, or surface area. We aren't just scaled up chimps.

Other animals with larger brains might have other bottlenecks preventing them from reaching full potential of their intelligence. Neanderthals might have been smarter than us, but went extinct for reasons not related to intelligence.

But my point stands - our brains have evolved directly from apes' brains and the main difference between them and us is brain size.

> Other animals with larger brains might have other bottlenecks

>>> What’s the main difference between an ape’s brain and a human brain? Scale.

Your argument is inconsistent. Very clearly everything isn't scale or we'd use other things besides transformers. Different architectures scale in different ways and everything has different inductive biases. No one doubts scale is important, but there's a lot more.

Scary how someone can be so confident in their wrong information.

> An African Elephant has roughly 3x the number of neurons humans have.

An African elephant's brain is not a scaled up chimp brain in any way. African elephants have fewer cortical neurons than a chimp, and roughly a third of the number that humans have.

> Dolphins beat humans in total surface area.

Animals even less related to humans and chimps, with no prehensile appendages, living in an environment where building stuff is exceedingly difficult. And of course their brains are obviously different from any great ape.

> Neanderthals are estimated to have had larger brains too!

And were just as smart as us and also had a scaled up chimp brain.

animal :: cortical neurons (billions) :: total neurons (billions)

Human :: 16 :: 86

Gorilla :: 9.1 :: 33

Chimp :: 6 :: 22

African Elephant :: 5.6 :: 251

Chimps are generally considered more intelligent than gorillas.

Bottlenose Dolphins have 11-15b cortical neurons while humans are in the range 14-18b (range is measurement uncertainty). It's also worth noting these dolphins have a larger brain mass (1.6 kg) and larger cortical surface (3700 cm²) than humans (1.3 kg and 2400 cm², respectively).
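
For anyone who wants to check the ratios being argued over, here is a minimal sketch that uses only the figures quoted in the table above. The names `NEURONS` and `ratio_to_human` are illustrative, not from any source, and the numbers are the thread's own, not independently verified.

```python
# Neuron counts in billions, copied from the table quoted in this thread:
# (cortical neurons, total neurons). Assumed as given, not re-verified.
NEURONS = {
    "Human": (16.0, 86.0),
    "Gorilla": (9.1, 33.0),
    "Chimp": (6.0, 22.0),
    "African Elephant": (5.6, 251.0),
}


def ratio_to_human(animal: str) -> tuple[float, float]:
    """Return (cortical, total) neuron counts as a multiple of the human figures."""
    cortical, total = NEURONS[animal]
    human_cortical, human_total = NEURONS["Human"]
    return cortical / human_cortical, total / human_total


if __name__ == "__main__":
    for animal in NEURONS:
        cortical_ratio, total_ratio = ratio_to_human(animal)
        print(f"{animal:18s} cortical x{cortical_ratio:.2f}  total x{total_ratio:.2f}")
```

On these figures the elephant comes out at roughly 3x the human total but only about a third of the human cortical count, so both of the quoted claims are consistent with the same table; they just count different neurons.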

> with no prehensile appendages, living in...

So more than scale. Glad we agree. Seems you also agree that architecture matters too.

> Just that it isn't the immediate transformer killer.

What is the best/stable-ish linear alternative to transformers right now? Especially for text generation and summarization.

We have domain-specific ways of oversampling and search, so we much prefer less expensive models.

As someone who's worked at several NVIDIA competitors, including Groq, I can guarantee you, based on my knowledge of existing products, that they would be able to make much more money if they had models with a lower memory footprint. Given the amount of venture capital deployed for this (on the order of hundreds of millions), I don't believe this is a reasonable take.

Sure, NVIDIA et al. may not want that (although, again, I don't see why... they too can't produce chips fast enough, so being able to provide models for customers now ought to be good), but there's so much money out there that does...

Why would Meta, Microsoft, Amazon and Google want Nvidia to remain dominant in hardware? Are you treating “big tech” like they all have one hive mind?
For MSFT, AMZN, GOOG, the competitive advantage comes from having huge datasets (that Nvidia doesn't have). It's a symbiosis that benefits the data-rich and GPU-rich players.
This still makes little sense, as scale will always matter. If you can drop the compute cost of a model by 10x, it means you can increase model integrity/intelligence/speed etc. beyond what your compute-bound competitors have.

Simply put, for the time being huge datasets are going to be needed and those with bigger (cleaner?) datasets will have a better behaving model.

Where is the symbiosis? If data is the differentiator, how do the data owners benefit from Nvidia eating into their margins?
It's symbiosis based on a common factor they both appreciate:

- the data/processing having to be large means the data-owners have a benefit

- the data/processing having to be large means NVIDIA also has a benefit (sells more GPUs to handle all that load)

Data owners are benefitting from having access to data while others don't.

High processing cost is not a benefit to them at all. It's just a cost eating into their margins.