Hacker News
by amatic 714 days ago
This sounds amazing! Are there any metrics on how often different models pass tests? Has anyone used a similar process to fine-tune an LLM?