by bentobean 504 days ago
The remaining 1% is almost certainly where the good stuff is.
2 comments

Or just people's names. Something like "Joe Blow said he saw a person with a brown bag." You'd redact Joe's name even if there's nothing particularly interesting about him.
You'd redact a name, but not an entire file.
Depending on what remains, it may be possible to unblind the redacted names by considering the sum total of the evidence. For example, say 10 people were in the room and all of them gave testimony, but only nine statements have names attached. Who could the 10th, unnamed person be? Far easier to just keep the entire thing redacted.
With LLMs, that might even be automated.
Constraint solvers would be a better choice, IMO.

LLMs may help convert the text into a form for the constraint solvers, but they're not the tool I'd use for actually connecting the dots.
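
To make the "10 people, nine named statements" idea concrete, here's a rough sketch of the elimination step as a brute-force search. Everything in it is hypothetical: the attendee labels, the slot names, and the `consistent_assignments` helper are made up for illustration, and a real constraint solver would replace the naive enumeration.

```python
from itertools import permutations

def consistent_assignments(attendees, named, redacted_slots, constraints=()):
    """List every way to fill the redacted slots with unaccounted-for attendees.

    attendees:      everyone known to have been present (hypothetical labels)
    named:          people whose statements carry an unredacted name
    redacted_slots: labels for statements whose author is blacked out
    constraints:    functions taking an assignment dict and returning True/False
    """
    # Anyone present but not attached to a named statement is a candidate.
    candidates = sorted(set(attendees) - set(named))
    results = []
    for perm in permutations(candidates, len(redacted_slots)):
        assignment = dict(zip(redacted_slots, perm))
        if all(check(assignment) for check in constraints):
            results.append(assignment)
    return results

# Case from the thread: 10 attendees, nine named statements, one redacted.
attendees = [f"P{i}" for i in range(1, 11)]
named = attendees[:9]
single = consistent_assignments(attendees, named, ["REDACTED_1"])

# Two redactions plus one extra clue (assumed: the author of slot X
# refers to D in the third person, so X is not D).
pair = consistent_assignments(
    ["A", "B", "C", "D", "E"],
    ["A", "B", "C"],
    ["X", "Y"],
    constraints=[lambda a: a["X"] != "D"],
)

print(single)
print(pair)
```

With a single redaction the set difference alone pins down the name. With several, each extra clue prunes the candidate assignments, which is the part a proper solver would do far more efficiently than this enumeration.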

And if it turns out that there isn't anything interesting in that 1%, will you abandon this heuristic, and be more ready to accept that maybe the mundane explanation of 'they were kept classified because the people involved are still alive' is the norm for stuff like this?
This is the big question. I think Pompeo told Trump that the remaining 1% includes the names of US spies inside Havana and Moscow. How would you ensure their descendants are protected?