Hacker News new | ask | show | jobs
by BenGosub 373 days ago
Certainly, in those cases one needs to be clever and design an evaluation framework that will grade based on soft criteria, or maybe use user feedback. Still, over time a good train-test database should be built and leveraging dspy will do improvements even in those cases.