

by banterfoil 2923 days ago

I'm writing a service that downloads content from Reddit. Usually I just want to see the top video of the week, or maybe the top 5 pics from /r/pics every day, or something like that. The service has handlers for various kinds of URLs. For example, if it detects a YouTube URL, it uses "youtube-dl"; if it's an image link, it uses curl; if it's a text post, I might download it as a JSON object from the Reddit API.
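A minimal sketch of that kind of per-URL handler dispatch, in Python. This is not the poster's actual code: the host lists, the image extensions, the output file names, and the function names `classify` and `build_command` are all assumptions for illustration. The text-post branch assumes Reddit's convention of serving a post as JSON when `.json` is appended to its URL.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical lookup tables; a real service would cover many more domains.
VIDEO_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "old.reddit.com"}
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif"}


def classify(url):
    """Return 'video', 'image', 'text', or 'unknown' for a submitted URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    ext = posixpath.splitext(parts.path)[1].lower()
    if host in VIDEO_HOSTS:
        return "video"
    if ext in IMAGE_EXTS:
        return "image"
    if host in REDDIT_HOSTS and "/comments/" in parts.path:
        return "text"
    return "unknown"


def build_command(url, dest_dir):
    """Build the argument list for the downloader that matches the URL kind."""
    kind = classify(url)
    if kind == "video":
        # youtube-dl's -o flag takes an output filename template.
        return ["youtube-dl", "-o", dest_dir + "/%(title)s.%(ext)s", url]
    if kind == "image":
        name = posixpath.basename(urlparse(url).path)
        return ["curl", "-L", "-o", dest_dir + "/" + name, url]
    if kind == "text":
        # Assumes a plain post URL with no query string.
        json_url = url.rstrip("/") + ".json"
        return ["curl", "-L", "-o", dest_dir + "/post.json", json_url]
    return None  # no handler for this kind of link yet
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command separately from running it keeps the dispatch logic easy to test without touching the network.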

This serves two purposes for me: 1. I can be more specific about how I consume Reddit, i.e., I can fine-tune parameters for each subreddit. 2. Downloads go straight to my NAS, so I can hoard them. (Bit of a datahoarder here.)

It's been a fun learning experience seeing the diversity of URLs and domains that get posted to Reddit. Definitely a different perspective.