by TerrifiedMouse 937 days ago
> very least show clear signs of creativity

Do you know how that “creativity” is achieved? It’s done with a random number generator. Instead of having the LLM pick the absolute most likely next token, they have it select from a set of most likely next tokens - size of the set depends on “temperature”.

Set temperature to 0, and the LLM will talk in circles and not really say anything interesting. Set it too high and it will output nonsense.
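For reference, here is a minimal sketch of how temperature is commonly applied when sampling. In typical implementations the temperature rescales the model's scores (logits) before the softmax rather than directly fixing the size of the candidate set; separate settings such as top-k or top-p usually handle that truncation. The logits and the function names `softmax_with_temperature` and `sample_next_token` below are made up for illustration and are not taken from any particular library.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, temperature, rng):
    """Pick a token index; temperature 0 is treated as greedy (argmax) decoding."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical scores for three candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.1]
rng = random.Random(0)  # seeded so runs are repeatable

greedy = sample_next_token(logits, 0, rng)        # always the top-scoring token
sharp = softmax_with_temperature(logits, 0.1)     # nearly all mass on the top token
flat = softmax_with_temperature(logits, 100.0)    # close to uniform
```

With a low temperature the top token dominates, which is why output becomes repetitive; with a very high temperature the distribution flattens toward uniform, which is why output drifts into nonsense.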

The whole design of LLMs doesn’t seem very well thought out. Things are done a certain way not because it makes sense but because it seems to produce “impressive” results.


I know that, but to me that statement isn't much more helpful than "modern AI is just matrix multiplication" or "human intelligence is just electric current through neurons".

Saying that it's done with a random number generator doesn't really explain the wonder of achieving meaningful creative output, as in being able to generate literature, for example.

> Set temperature to 0, and the LLM will talk in circles and not really say anything interesting. Set it too high and it will output nonsense.

Sounds like some people I know, at both extremes.

> The whole design of LLMs doesn’t seem very well thought out. Things are done a certain way not because it makes sense but because it seems to produce “impressive” results.

They have been designed and trained to solve natural language processing tasks, and are already outperforming humans on many of those tasks. The transformer architecture is extremely well thought out, based on extensive R&D. The attention mechanism is a brilliant design. Can you explain exactly which part of the transformer architecture is poorly designed?

> They have been designed and trained to solve natural language processing tasks

They aren’t really designed to do anything actually. LLMs are models of human languages - it’s literally in the name, Large Language Model.

https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...

I’m sorry but I don’t trust something that uses a random number generator as part of its output generation.

> They aren’t really designed to do anything actually. LLMs are models of human languages - it’s literally in the name, Large Language Model.

No. And the article you linked to does not say that (because Wolfram is not an idiot).

Transformers are designed and trained specifically for solving NLP tasks.

> I’m sorry but I don’t trust something that uses a random number generator as part of its output generation.

The human brain also has stochastic behaviour.