Your reading is correct. Their primary aim was to compare some real lawyers' work with their wonderprompts, to show off an upcoming service and how much faster it is than humans. But they didn't have the funds to ask lawyers to review 15 or 50 or 1000 contracts, just 10, hence the cheat in methods and in conclusions.

This is not about drawing up a benchmark for an industry, let alone validating any useful method in general.

Ten docs reviewed based on a single review 'playbook' (I think it's not in the paper, but probably max 20 questions per contract?) and compared across 3 different providers/roles + LLMs...