by johnsmith1840
369 days ago
AGI is likely a combination of these two papers plus something new, probably along the lines of distillation.

1. Preventing collapse -> the model gets "full": https://arxiv.org/pdf/1612.00796

2. Forgetting leads to better generalization: https://arxiv.org/abs/2307.01163

3. An unknown paper that connects the two:
- allow a "forgetting" model that improves generalization over time.
- I tried for a long time to make this, but it's a bit difficult.

A fun implication: if this is true, AGI will need "breaks" and will likely need to consume a high variety of non-task content, much like a person does.