by timhigins 405 days ago
> LLM could hallucinate

The job of any context retrieval system is to retrieve the relevant info for the task so the LLM doesn't hallucinate. Maybe build a benchmark based on lesser-known external libraries, with test cases that check the output is correct (or with a mocking layer that verifies the LLM-generated code calls roughly the correct functions).
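A rough sketch of what the mocking-layer idea could look like in Python, using only the standard library's `unittest.mock`. The library name `obscurelib` and its `connect` / `fetch_rows` functions are made up for illustration, and the helper `called_functions` is hypothetical, not part of any existing benchmark. The idea is to swap the real library for a `MagicMock`, run the generated code, and inspect which functions it called.

```python
import sys
from unittest.mock import MagicMock, patch


def called_functions(generated_code: str, module_name: str) -> set[str]:
    """Run generated code against a mocked module and return the dotted
    names of everything it called on that module."""
    fake = MagicMock(name=module_name)
    # Temporarily make `import <module_name>` resolve to the mock.
    with patch.dict(sys.modules, {module_name: fake}):
        # NOTE: exec() on untrusted LLM output is unsafe; a real benchmark
        # would run this inside a sandbox or a separate process.
        exec(generated_code, {"__name__": "__generated__"})
    # Each entry in mock_calls is a (name, args, kwargs) triple.
    return {call[0] for call in fake.mock_calls if call[0]}


# Hypothetical LLM output for a task against the made-up `obscurelib`.
generated = """
import obscurelib
conn = obscurelib.connect("db://example")
rows = conn.fetch_rows("users", limit=10)
"""

called = called_functions(generated, "obscurelib")
# "Roughly correct" check: the expected entry point was used, and some
# call path ends in the expected method.
uses_connect = "connect" in called
uses_fetch = any(name.endswith("fetch_rows") for name in called)
```

This only checks which functions were reached, not whether the arguments or results are right, so it is a cheaper and looser signal than real test cases.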

1 comment

Thanks for the feedback. This will be my next step. Personally, I find it hard to design those test cases by myself.