Hacker News new | ask | show | jobs
State-space models can learn in-context by gradient descent (arxiv.org)
2 points by trextrex 604 days ago