|
|
|
|
|
by shawntan
247 days ago
|
|
That analysis provided a very non-abrasive wording of their evaluation of HRM and its contributions. The comparison with a recursive / universal transformer on the same settings is telling. "These results suggest that the performance on ARC-AGI is not an effect of the HRM architecture. While it does provide a small benefit, a replacement baseline transformer in the HRM training pipeline achieves comparable performance." |
|