by gaelenh 4679 days ago
I wrote a similar image service (resize, crop, border, color analysis, format, and other fun features) a few years ago. It served about 6M image requests a day from 4 servers, plus a whole lot more from the CDN.

If you're using a CDN, make sure it includes query string parameters in its cache key. Our resize controls were part of the URI path, but the other commands were part of the QS. Our CDN stripped query strings, so the first variant requested would be cached and any later call with a different QS would get that same image back.
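To make the failure mode concrete, here is a rough sketch of how a cache key behaves with and without the query string. The function name `cache_key` and the example URLs are made up for illustration; a real CDN exposes this as a configuration setting, not as code you write.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, keep_query: bool) -> str:
    """Build a cache key for a URL (hypothetical helper, not any CDN's API)."""
    parts = urlsplit(url)
    if not keep_query:
        # Query string stripped: every variant of the same path collides.
        return parts.path
    # Sort the parameters so ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return f"{parts.path}?{query}" if query else parts.path


a = "/img/200x100/cat.jpg?border=2"
b = "/img/200x100/cat.jpg?border=5"
print(cache_key(a, keep_query=False) == cache_key(b, keep_query=False))
print(cache_key(a, keep_query=True) == cache_key(b, keep_query=True))
```

With the query stripped, both URLs map to the same key, which is the "first hit wins" behavior described above. Sorting the parameters is an optional extra that avoids caching the same image twice under reordered query strings.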

A possible alternative that I never implemented was using the request header instead of the query string for additional commands.
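Since I never built this, the following is only a sketch of what the header approach might look like. The header name `X-Image-Commands` is invented for the example. The one part that is standard HTTP is `Vary`: it tells caches which request headers change the response, so the cache has to honor it or you are back to the same collision problem.

```python
# Hypothetical header name carrying the extra image commands.
COMMAND_HEADER = "X-Image-Commands"


def response_headers(content_type: str) -> dict[str, str]:
    """Headers the image server would send so caches vary on the commands."""
    return {"Content-Type": content_type, "Vary": COMMAND_HEADER}


def cache_key(path: str, request_headers: dict[str, str]) -> tuple[str, str]:
    """Cache key a Vary-aware cache would effectively use for this request."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return (path, lowered.get(COMMAND_HEADER.lower(), ""))


print(cache_key("/img/200x100/cat.jpg", {"x-image-commands": "border=2"}))
```

One practical catch: a plain `<img>` tag cannot send custom request headers, so this would only suit clients that fetch images programmatically.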

Edit to add: Also, if you have lots of resized images on a page, be careful when Google or Bing scrapes you. Your CPU and IO will go through the roof as your servers go crazy trying to dynamically generate all the images.

Kind of strange that the bots would spike you. Wouldn't anything on a page have been requested before and thus cached on your server as a file? I've written all this stuff before as well, including overlaying a site logo on images that were hotlinked, but I always cached anything I had to generate to the file system the first time it was requested and simply served it after that.
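For anyone unfamiliar with the pattern, generate-once-then-serve-from-disk looks roughly like this. It is a minimal sketch, not my production code: `cached_image`, `cache_dir`, and the `render` callback are names made up for the example, and it ignores eviction and concurrent writers.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_image(cache_dir: Path, request_key: str, render: Callable[[], bytes]) -> bytes:
    """Return the image for request_key, rendering it only on a cache miss."""
    # Hash the full request (path plus commands) into a safe file name.
    name = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
    path = cache_dir / name
    if path.exists():
        return path.read_bytes()
    data = render()  # the expensive resize/crop/overlay step
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temp file first so readers never see a partial image.
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)
    return data
```

The second request for the same key never calls `render`, which is why a crawler hitting old pages costs only disk reads, provided the cache is big enough to hold everything.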
Short version: We had direct access to all the large news image providers and analyzed the images for topic and story identification. We used this data to dynamically generate hundreds of thousands of pages. Way too much data to cache to file (our storage cluster was many many terabytes). We used a CDN to cache the images. When a bot would scrape, it would hit all the old pages that fell out of the cache. Perhaps our fault for having such a large sitemap.xml.
So what happened to it? Are you still running it? If not, why not?
Not running (probably). We were acquired by a competitor. Last I heard, they upgraded their image server code with the bare-essential features needed to migrate our clients over. A lot of good software was buried in that acquisition.