by AndrewStephens 438 days ago
Why do you store your webserver's logs? My reading of the GDPR (I am not a lawyer) is that it strongly encourages site owners to store the very minimum amount of data about visitors - something that I wholeheartedly agree with.

Server logs are useful for debugging the site but also contain potentially identifying information (IP addresses) so I have my site delete them after 48 hours.
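For anyone wanting to do the same, here is a minimal sketch of a 48-hour cleanup, assuming the logs are plain files in one directory and that file modification time is a good enough proxy for age. The `delete_expired_logs` and `is_expired` names and the `*.log*` pattern are made up for illustration, not taken from any particular server setup.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 48 * 3600  # the 48-hour retention window


def is_expired(mtime: float, now: float, max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if a file last modified at `mtime` is older than `max_age` seconds."""
    return now - mtime > max_age


def delete_expired_logs(log_dir: str, pattern: str = "*.log*") -> list[Path]:
    """Delete files in `log_dir` matching `pattern` (a hypothetical naming
    scheme) whose modification time is past the retention window."""
    now = time.time()
    removed = []
    for path in Path(log_dir).glob(pattern):
        if path.is_file() and is_expired(path.stat().st_mtime, now):
            path.unlink()
            removed.append(path)
    return removed
```

Run from a daily or hourly scheduled job, this keeps roughly two days of logs on disk. Many setups would get the same effect from their existing log rotation configuration.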

User-submitted comments are obviously required for your site to function, so you are in the clear there.

1 comments

I read the logs manually, with my own eyes, because I am interested in learning about the web and the internet. In fact, today I found a whole new useful search engine because I saw its spider in my logs.

    64.62.202.82 "GET /library/Math/Mathematical%20Methods%20for%20Physicists_%20A%20concise%20introduction_%20Tai%20L%20Chow_%202000.pdf -" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Centurybot/1.0; +http://www.rightdao.com/bot.html) Chrome/131.0.0.0 Safari/537.36"
It turns out that http://www.rightdao.com/ is a great old-style search engine that actually returns many tens of pages and thousands of results, as opposed to Google, which only ever returns <400, Bing <900, and Kagi <200.

I guess I keep logs because I want to interact more directly with the internet as a whole and experience the serendipity that comes with that.

Then keep your logs for 14 days, and remove IPs from them after 48h.

Tools for that exist, you don't keep unnecessary data, and you're in the clear.
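As a rough illustration of the IP-removal step (a sketch, not a replacement for an existing tool), the snippet below blanks the client address in a single log line. It assumes the address is the first space-separated field, as in the log line quoted above; the `anonymize_line` name and the `-` placeholder are made up for this example.

```python
import ipaddress


def anonymize_line(line: str, placeholder: str = "-") -> str:
    """Replace the leading field of a log line with `placeholder`,
    but only if that field parses as an IPv4 or IPv6 address."""
    head, sep, rest = line.partition(" ")
    try:
        ipaddress.ip_address(head)
    except ValueError:
        return line  # first field is not an IP address; leave the line untouched
    return placeholder + sep + rest
```

Only the first field is touched on purpose: a regex matching anything that looks like a dotted quad would also rewrite tokens such as `Chrome/131.0.0.0` in the user-agent string. Applied to rotated log files older than 48 hours, this leaves the paths and user agents readable for the remaining days.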