Hacker News
by manishsharan 1157 days ago
So what happens to the data from the PDF, and to the uploaded file itself, once I have stopped chatting with it? A hard pass if you can't ensure the privacy of my data.

Right? But of course it HAS to save the PDF, otherwise how is it going to learn from it? The model can't possibly rely on ML processing only while the user has the file open.
I don't think that's an accurate mental model of how a tool like this works.

It's not training a new model on the PDF, or accumulating additional training into its existing model.

Instead, it basically copies and pastes relevant chunks of the PDF into the prompt (invisibly) and then pastes in your question.

It does use calculated embeddings to help it spot which sections are the most relevant, and it will store those (since they cost money in API calls to compute) - but it could be implemented to delete those stored embeddings and the PDF itself when the user stops interacting, or requests that the document be deleted.
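
To make that concrete, here is a rough sketch of the pattern, not this tool's actual code. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real (paid) embedding API, and `DocumentSession`, `build_prompt` and `delete` are hypothetical names. It shows embeddings being computed once and cached, the best-matching chunk being pasted into the prompt ahead of the question, and both the chunks and the cached embeddings being dropped on request.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding API call: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class DocumentSession:
    """Hypothetical per-upload state: the PDF's text chunks plus cached embeddings."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        # Computed once and kept, since a real embedding call would cost money.
        self.embeddings = [embed(c) for c in self.chunks]

    def build_prompt(self, question, top_k=2):
        """Paste the top_k most similar chunks into the prompt, then the question."""
        q = embed(question)
        ranked = sorted(
            range(len(self.chunks)),
            key=lambda i: cosine(q, self.embeddings[i]),
            reverse=True,
        )
        context = "\n".join(self.chunks[i] for i in ranked[:top_k])
        return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

    def delete(self):
        """Drop both the document text and its stored embeddings."""
        self.chunks.clear()
        self.embeddings.clear()


# Made-up chunks standing in for text extracted from an uploaded PDF.
chunks = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within thirty days of purchase.",
    "The device weighs 1.2 kilograms.",
]
session = DocumentSession(chunks)
prompt = session.build_prompt("How long is the warranty?", top_k=1)
print(prompt)

# When the user stops interacting or asks for deletion:
session.delete()
```

Note that nothing here trains or updates a model. The only per-document state is the chunk text and its embeddings, which is why deleting both on request is a plausible design.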

FWIW I opened a new tab and uploaded a different PDF, then proceeded to ask it about the previous PDF.

It swears it has never heard of the previous PDF or anything about it (rather, it suggests you go search the web).

So, at least it doesn't seem to leak your upload to other users. But I wonder what it does with the info.