by nlrk
841 days ago
When I read the paper, I thought the idea was that changing \Delta lets the model learn different things over different time scales. As you quoted: "the main source of improvement". I don't have an LLM background, just controls, so I might be wrong.