by kdrag0n 113 days ago
What tasks can the model do out of the box? Was each of the examples a different fine-tuned model?

It's a pretty general policy, but this is all super early. It's great at exploring websites, so fuzzing was easy. For CAD, it has good enough base rates with the few-shot prompt when we do the repetitive stuff, and we gave it checkpoints on each step. The other stuff in the mosaic is just some of our favorite clips from internal evals.