by hiq 1022 days ago
You can consider your best token predictor as a lossy compression of the corpus it was trained on.
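To make that concrete, here is a rough sketch with a toy stand-in for a "token predictor": a character bigram model. Everything in it (the corpus string, `train_bigram`, `code_length_bits`, the add-one smoothing) is made up for illustration and is not taken from any real system. The better the model predicts the next character, the fewer bits an ideal entropy coder would need to encode the text. The model itself keeps only bigram counts, so it cannot reproduce the corpus exactly, which is the "lossy" part.

```python
import math
from collections import Counter, defaultdict


def train_bigram(text):
    """Count character bigrams: counts[prev][nxt] = occurrences of nxt after prev."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def code_length_bits(text, counts, alphabet_size):
    """Ideal code length in bits for text[1:] under the bigram model.

    Uses add-one smoothing (an assumption of this sketch) so that every
    character has non-zero probability.
    """
    bits = 0.0
    for prev, nxt in zip(text, text[1:]):
        total = sum(counts[prev].values()) + alphabet_size
        p = (counts[prev][nxt] + 1) / total
        bits -= math.log2(p)
    return bits


# Hypothetical toy corpus, repeated so the counts are not dominated by smoothing.
corpus = "the cat sat on the mat. the cat ate the rat. " * 20
alphabet = sorted(set(corpus))

model = train_bigram(corpus)
model_bits = code_length_bits(corpus, model, len(alphabet))

# Baseline: a predictor that knows nothing and treats every character as equally likely.
uniform_bits = (len(corpus) - 1) * math.log2(len(alphabet))

print(f"uniform baseline: {uniform_bits / (len(corpus) - 1):.2f} bits/char")
print(f"bigram model:     {model_bits / (len(corpus) - 1):.2f} bits/char")
```

The gap between the two numbers is what the model has "learned" about the corpus. A real language model plays the same game with a far better predictor and a far larger corpus.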