

by codeviking 1739 days ago

We don't retain the uploaded document. We cache the extracted content to make things more efficient.

See https://papertohtml.org/about:

> What data do we keep? We cache a copy of the extracted content as well as the extracted images. This allows us to serve the results more quickly when a user uploads the same file again. We do not retain the uploaded files themselves. Cached content is never served to a user who has not provided the exact same document.

Also, we can delete the extracted data on request. Just send a note to accessibility@semanticscholar.org.
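
For anyone wondering how a cache can serve repeat uploads without keeping the file itself, here is a minimal sketch. It assumes the cache is keyed by a SHA-256 hash of the uploaded bytes, which is one common way to do it; nothing here says that is how papertohtml.org actually works. The class `ExtractionCache`, its methods, and the `extract` callable are hypothetical names made up for illustration.

```python
import hashlib


class ExtractionCache:
    """Hypothetical in-memory cache keyed by a hash of the uploaded bytes.

    Only the digest and the extracted content are stored; the upload
    itself is never kept.
    """

    def __init__(self, extract):
        self._extract = extract  # assumed callable: bytes -> extracted content
        self._store = {}         # hex digest -> extracted content

    @staticmethod
    def key_for(document: bytes) -> str:
        # Identical bytes always give the same key; a different file
        # gives a different key and so cannot reach this entry.
        return hashlib.sha256(document).hexdigest()

    def get_or_extract(self, document: bytes):
        key = self.key_for(document)
        if key not in self._store:
            # Cache miss: run the (expensive) extraction once and keep the result.
            self._store[key] = self._extract(document)
        return self._store[key]

    def delete(self, document: bytes) -> bool:
        # Remove the cached extraction for this document, if there is one.
        key = self.key_for(document)
        if key in self._store:
            del self._store[key]
            return True
        return False
```

With this layout, a cached result is only reachable by someone who can present the exact same bytes again, and a deletion request comes down to removing one entry.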

Sorry for the confusion!

1 comments

kahon65 1739 days ago

Ah okay, thank you.

> Also, we can delete the extracted data on request.

Just to be 100% clear, you are referring to the cached extracted data, right?


codeviking 1738 days ago

Yup, that's right.


kahon65 1738 days ago

Thank you very much!
