by Alex3917 938 days ago
His secretary may have typed them, but if they were mostly written by his secretary then Stanford wouldn't have paid to have them archived. Even though the messages are mostly redacted, actually going through and doing those redactions is still hundreds of hours of work.

"if they were mostly written by his secretary then Stanford wouldn't have paid to have them archived." - that is straight-up incorrect. Archiving emails is very cheap. And the redactions look to have been done programmatically.
So ePADD redacts everything except the named entities, but that still means going through each message by hand to ensure that the entities generated by the NLP software are correct. Plus the time spent fixing ePADD to make the import run correctly with his non-standard email client, the time spent negotiating permissions and restrictions related to the collection, etc.

Cf.: https://github.com/search?q=repo%3AePADD%2Fepadd++knuth&type...
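
For anyone wondering what "redacts everything except the named entities" looks like in practice, here is a toy sketch in Python. It is not ePADD's actual code: the function name, the mask character, and the hand-supplied entity list are all assumptions for illustration, with the list standing in for NER output that a person has already checked.

```python
import re

def redact_except_entities(text, entities, mask="*"):
    """Mask every non-space character that is not inside a listed entity.

    `entities` is a hypothetical, already-reviewed list of entity strings;
    a real pipeline would get candidates from an NER model and then have
    someone verify them by hand.
    """
    keep = [False] * len(text)
    for entity in entities:
        # Mark every occurrence of the entity as text to preserve.
        for m in re.finditer(re.escape(entity), text):
            for i in range(m.start(), m.end()):
                keep[i] = True
    # Whitespace is preserved so the shape of the message stays visible.
    return "".join(
        ch if keep[i] or ch.isspace() else mask
        for i, ch in enumerate(text)
    )

message = "Don Knuth will visit Stanford on Friday."
entities = ["Don Knuth", "Stanford"]  # hypothetical, hand-checked entities
print(redact_except_entities(message, entities))
# prints: Don Knuth **** ***** Stanford ** *******
```

The masking itself is trivial. The expensive part is the review step: a missed or mislabeled entity either leaks text or hides a name that should have been kept, which is why each message still needs a human pass.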