by skybrian
696 days ago
I’m a little skeptical of processes that seem to create more information than you had to start with. For a game like chess or Go, it makes sense, because winning strategies are implicit in the rules of the game, but it takes a lot of computation to discover the consequences. Similarly for math, where theorems are non-obvious consequences of axioms. And computer code can be similar to math. But how does that work for LLMs in general? They’re trained on everybody’s opinions all at once, both right and wrong answers. They’re trained to generate text supporting all sides of every argument. What does more training on derived text actually do?