Hacker News
by bastawhiz 122 days ago
I think it's equally likely that the property management company here has an incorrectly configured S3 bucket (or something like it) that has unintentionally exposed a bunch of leases. It seems more plausible to me that a directory of hundreds or thousands of nearly identical leases was exposed online and scraped than that someone uploaded enough lease documents to Claude for them all to end up in training data. I'd be really surprised, actually, if any major AI company were taking uploaded documents and using them for training, since they're very likely to contain extremely sensitive data.