| HN Mirror



	by necroforest 895 days ago
	> We aren't going to see more progress until we have a way to generalize the compute graph as a learnable parameter

	That's a bold statement, since a ton of progress has been made without learning the compute graph.

3 comments

nomel 895 days ago

From my naive perspective, there seems to be a plateau that everyone is converging on, somewhere between ChatGPT 3.5 and 4 levels of performance. Some suspect that the implementation of 4 might involve several expert models, which would already be extra sauce, external to the LLM. Combine this with the observation that generative models converge to the same output given the same training data, regardless of architecture (having trouble finding the link; it was posted here some weeks ago), and external secret sauce, outside the model, might be where the near-term gains are.

I suppose we'll see in the next year!

manojlds 895 days ago

We already have competitors to Transformers

https://arxiv.org/abs/2312.00752

heyoni 895 days ago

Where do I enter in my credit card info?

taneq 895 days ago

You hire people to implement a product based on this?

uoaei 895 days ago

A ton of progress can be made climbing a tree, but if your goal is reaching the moon it becomes clear pretty quickly that climbing taller trees will never get you there.

nethi 895 days ago

True, but it is the process of climbing trees that gives the insight whether taller trees help or not and if not, what to do next.

mrguyorama 894 days ago

Not true. Climbing trees for millions of years taught us nothing about orbits, or rockets, or distances literally incomprehensible to humans, or the vacuum of space, or any possible way to get higher than a tree.

We eventually moved on to lighter than air flight, which once again did not teach us any of those things and also was a dead end from the "get to the sky/moon" perspective, so then we invented heavier than air flight, which once again could not teach us about orbits, rockets, distances, or the vacuum of space.

What got us to the moon was rigorous analysis of reality with math to discover Newton's laws of motion, from which you can derive rockets, orbits, the insane scale of space, etc. No amount of further progress in planes, airships, kites, birds, anything on earth would ever have taught us the techniques to get to the moon. We had to analyze the form and nature of reality itself and derive an internally consistent model of that physical reality in order to understand anything about doing space.

nomel 894 days ago

> Climbing trees for millions of years taught us nothing about

Considering the chasm in the number of neurons between the apes and most other animals, I think one could claim that climbing those trees had some contribution to the ability to understand those things. ;) Navigating trees, at weight and speed, has a minimum intelligence requirement.

gpderetta 895 days ago

With enough thrust, even p̵i̵g̵s̵ trees can fly.

ActorNightly 895 days ago

We have made progress in efficiency, not functionality. Instead of searching Google or Stack Overflow or any particular documentation, we just go to ChatGPT.

Information compression is cool, but I want actual AI.

danielmarkbruce 895 days ago

The idea that there has been no progress in functionality is silly.

Your whole brain might just be doing "information compression" by that analogy. An LLM is sort of learning concepts. Even Word2Vec "learned" that king - male + female = queen, and that's a small model that's really just one part (not exact, but similar) of a transformer.
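To make the king - male + female = queen arithmetic concrete, here is a minimal sketch. The vectors are hand-made, hypothetical 3-d toys (royalty, maleness, femaleness), not real Word2Vec output, which is learned from text and has hundreds of dimensions; the `analogy` helper is likewise just an illustration.

```python
import math

# Hypothetical toy embeddings: (royalty, maleness, femaleness).
# These are assumptions for illustration, not learned Word2Vec vectors.
EMBEDDINGS = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "male": [0.0, 1.0, 0.0],
    "female": [0.0, 0.0, 1.0],
    "apple": [0.0, 0.1, 0.1],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def analogy(word, minus, plus, embeddings=EMBEDDINGS):
    """Return the word closest to embeddings[word] - embeddings[minus] + embeddings[plus]."""
    target = [
        w - m + p
        for w, m, p in zip(embeddings[word], embeddings[minus], embeddings[plus])
    ]
    # Exclude the query words themselves, as analogy lookups usually do.
    candidates = {k: v for k, v in embeddings.items() if k not in (word, minus, plus)}
    return max(candidates, key=lambda k: cosine(target, candidates[k]))


result = analogy("king", "male", "female")
print(result)
```

With these toy vectors the nearest remaining word is "queen". In a real model the directions are not hand-assigned; they fall out of training on co-occurrence statistics.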

ActorNightly 894 days ago

Let me rephrase that.

One level deep information compression is cool, but I want actual AI.

It's true that our brains compress information, but we compress it in a much more complex manner, in the sense that we can not only recall stuff, but also execute a decision tree that often involves physical actions to find the answer we are looking for.

danielmarkbruce 894 days ago

An LLM isn't just recalling stuff. Brand new stuff, which it never saw in its training, can come out.

The minute you take a token and turn it into an embedding, then start changing the numbers in that embedding based on other embeddings and learned weights, you are playing around with concepts.
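A rough sketch of what "changing the numbers in that embedding based on other embeddings and learned weights" can look like: a single-head self-attention step in plain Python. The weight matrices here are hand-set identity matrices standing in for learned weights (an assumption for illustration), and the sketch leaves out multiple heads, residual connections, layer norm, and positional information.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def self_attention(embeddings, w_q, w_k, w_v):
    """Update each token embedding as a weighted mix of all tokens' values."""
    queries = [matvec(w_q, e) for e in embeddings]
    keys = [matvec(w_k, e) for e in embeddings]
    values = [matvec(w_v, e) for e in embeddings]
    scale = math.sqrt(len(keys[0]))
    updated = []
    for q in queries:
        # How strongly this token attends to every token, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        updated.append(mixed)
    return updated


# Hypothetical stand-ins for learned weights: identity matrices.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0]]  # two toy token embeddings
updated = self_attention(tokens, IDENTITY, IDENTITY, IDENTITY)
print(updated)
```

Each output vector is no longer the original one-hot embedding: it now carries a bit of the other token too, which is the sense in which context reshapes what a token "means".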

As for executing a decision tree, ReAct or Tree of Thought or Graph of Thought is doing that. It might not be doing it as well as a human does, on certain tasks, but it's pretty darn amazing.

ActorNightly 894 days ago

> Brand new stuff, which it never saw in its training, can come out.

Sort of. You can get LLMs to produce some new things, but these are statistical averages of existing information. It's kinda like a static "knowledge tree", where it can do some interpolation, but even then, it's interpolation based on statistically occurring text.

danielmarkbruce 894 days ago

The interpolation isn't really based on statistically occurring text. It's based on statistically occurring concepts. A single token can have many meanings depending on context and many tokens can represent a concept depending on context. A (good) LLM is capturing that.

p1esk 895 days ago

Fascinating. What’s “actual AI”?

mdp2021 895 days ago

> What’s “actual AI”

Is Ibn Sina (Avicenna, year ~1000) fine?

> [the higher faculty proper of humans is] the primary function of a natural body possessing organs in so far as it commits acts of rational choice and deduction through opinion; and in so far as it perceives universal matters

Or, "Intelligence is the ability to reason, determining concepts".

(And a proper artificial such thing is something that does it well.)

homarp 893 days ago

It is a tool that has the ability to craft a prompt that will break the current state-of-the-art model.

It is a tool that can be given a project in language X and produce an idiomatic port in language Y.

It is a tool that, given a 20-page paper spec, will ask the questions needed to clarify the spec.

ActorNightly 894 days ago

Something that can reason and figure things out without having ever been exposed to the information during training.

p1esk 894 days ago

This either includes GPT-4 or excludes people

taneq 895 days ago

It’s whatever computers can’t do, dummy! :P