|
|
|
|
|
by drdeca
850 days ago
|
|
Some sort of "generate descriptions of novel tasks including ways to evaluate performance at those tasks, evaluate quality of the generated tasks+evaluation-metrics, split tasks into subtasks, estimate difficulty of tasks in a way that is is judged on how it compares to a combined estimated difficulty of generated subtasks and to actual success rate and quality" sort of deal? |
|