|
|
|
|
|
by mtrovo
94 days ago
|
|
I think the main issue is treating LLM as a unrestrained black box, there's a reason nobody outside tech trust so blindly on LLMs. The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about what are the goal you're optimizing for. See karpathy's work on the autoresearch agent and how it carry experiments, it might be useful for what you're doing. |
|
Man, I wish this was true. I know a bunch of non tech people who just trusts random shit that chatgpt made up.
I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)
We had politicians share LLM crap, researchers doing papers with hallucinated citations..
It's not just tech people.