by brazzy
1161 days ago
>After that, we got to use reinforcement learning (agent GPTs with tools) to generate and self-validate more examples.

How would you "self-validate" against hallucinated facts? What makes self-validation possible is a set of hard external rules that can be evaluated independently and automatically, like the rules of chess or Go. We don't have anything like that for LLMs and what people want to use them for.