by boyter 2509 days ago
Image resizer scaling is one of the more interesting problems I have worked on in the last 10 years or so. I was part of a small team that designed and built the resizer that powers the nine.com.au network of sites. Modest by USA standards, it still gets close to hundreds of millions of views a day across the whole network.

We ended up using a shared-nothing architecture. The whole thing ran on 6 t2.large AWS instances using a slightly modified version of Thumbor in which the disk cache was shared across key rotations, so rotating the key did not trigger a large-scale cache invalidation. It worked quite well and we rotated the key every few weeks.
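For anyone unfamiliar with Thumbor's signed URLs, here is a minimal sketch of why a key rotation would normally invalidate a cache and how keying the cache on the unsigned part of the URL avoids it. The signing follows Thumbor's HMAC-SHA1 scheme as I understand it; `cache_key`, the key names and the image path are hypothetical illustrations, not necessarily the modification described above.

```python
import base64
import hashlib
import hmac


def sign(security_key: str, path: str) -> str:
    """URL-safe base64 of the HMAC-SHA1 of `path`, keyed with `security_key`."""
    digest = hmac.new(security_key.encode(), path.encode(), hashlib.sha1).digest()
    return base64.urlsafe_b64encode(digest).decode()


def signed_url(security_key: str, path: str) -> str:
    """Build a /<signature>/<options and image> style URL."""
    return f"/{sign(security_key, path)}/{path}"


def cache_key(url: str) -> str:
    """Hypothetical cache key: ignore the leading signature segment,
    so the same resize maps to the same cache entry under any key."""
    _signature, _, unsigned_path = url.lstrip("/").partition("/")
    return hashlib.sha1(unsigned_path.encode()).hexdigest()


path = "300x200/smart/example.com/photo.jpg"  # made-up resize request
old = signed_url("key-before-rotation", path)
new = signed_url("key-after-rotation", path)

print(old != new)                        # True: the full URL changes with the key
print(cache_key(old) == cache_key(new))  # True: the cache entry does not
```

A cache keyed on the full URL (signature included) would treat every image as new after a rotation, which is the large-scale invalidation being avoided.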

Things I learnt.

Pretty much all image resizers have the same performance, as all the good ones call out to C libraries in the end. Akamai (the CDN we used), despite having Site Shield on, would still on occasion hit the back-end ~100 times for the same image. I suspect all of the whitelisted machines could request the same image if their internal sharing didn't kick in fast enough.

Long tail images were the ones that brought the resizer to its knees. The hot images would quickly enter the local disk cache and were not an issue. Purge the whole cache though and the long tail images would quickly overwhelm the instances.
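A toy illustration of the hot vs long-tail effect after a purge. The traffic mix, image names and the unbounded in-memory set standing in for the disk cache are all made-up assumptions, chosen only to show the shape of the problem.

```python
def origin_resizes(requests, cache):
    """Count requests that miss `cache` (a set of image ids) and need a resize.

    Each missed image is then added to the cache, as a disk cache would do.
    """
    misses = {"hot": 0, "tail": 0}
    for image in requests:
        if image not in cache:
            misses[image.split("-")[0]] += 1
            cache.add(image)
    return misses


# Hypothetical traffic: 10 hot images share half of the requests,
# and 5,000 long-tail images get one request each.
hot = [f"hot-{i}" for i in range(10)]
tail = [f"tail-{i}" for i in range(5000)]
requests = []
for i, tail_image in enumerate(tail):
    requests.append(hot[i % len(hot)])
    requests.append(tail_image)

warm = origin_resizes(requests, set(hot) | set(tail))
cold = origin_resizes(requests, set())
print(warm)  # {'hot': 0, 'tail': 0}
print(cold)  # {'hot': 10, 'tail': 5000}
```

After the purge the hot half of the traffic costs only 10 resizes, while every long-tail request becomes a resize.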

The last thing I learnt was to have a backup CloudFront distribution ready to flip over to. At one point Akamai had issues and the resizer was facing origin load. It capped out at about 300 RPS, which couldn't keep up with what was expected. It got even worse when the T2 instances ran out of CPU credits. Spinning up CloudFront solved that issue once the DNS flip kicked in.
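A rough back-of-envelope for how long a burstable instance lasts under sustained load. The t2.large figures used here (2 vCPUs, 36 credits earned per hour, a maximum balance of 864 credits) are assumptions from memory of the AWS documentation and should be checked against it; the function name is made up for this sketch.

```python
def hours_until_credits_exhausted(balance, earn_per_hour, vcpus, utilisation):
    """Hours a burstable instance can sustain `utilisation` (0..1 per vCPU).

    One CPU credit is one vCPU running at 100% for one minute.
    """
    spend_per_hour = vcpus * utilisation * 60
    net_drain = spend_per_hour - earn_per_hour
    if net_drain <= 0:
        return float("inf")  # at or below baseline, the balance never runs out
    return balance / net_drain


# Assumed t2.large figures, starting from a full credit balance
# with both vCPUs pinned at 100%.
hours = hours_until_credits_exhausted(
    balance=864, earn_per_hour=36, vcpus=2, utilisation=1.0
)
print(round(hours, 1))  # 10.3
```

Under those assumptions, an instance pinned at full CPU by origin traffic drops back to baseline performance in under half a day, and sooner if it did not start with a full balance.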

One good thing to come out of it was that I helped write the C# Thumbor library, as we had one site that was using C# and nobody could move over to the new resizer without it.

3 comments

I once tried to pitch a "mobile ecommerce website as a service" company in Vancouver on a GPU-based image rescaler, around 2011.

A very dumb proposal: no caching, resize on the fly. The GPU has many gigabits per second of resizing throughput as long as only JPEG is involved. One GPU does the decoding with VDPAU, one does the encoding with CUDA.

That beat any Google App Engine based "elastic" service on economics, but the catch is that you have to ship that GPU resizer to every colo. That did not work out, as with Google App Engine you were getting access to Google's POP network for almost free, and they were already paying for a gigantic amount of CDN traffic.

----

When I worked as a sub-subcontractor on Alibaba's RDMA-wired DC project, there was one demo by another team where they got DSP devs involved and built a 10GB/s JPEG transcoder for under 100W. I think most of the power budget was going to the FPGA that was linking it all with the NIC :/

An expensive toy, but it again demonstrated to me just how strong the "lockdown" power of all those "cloud" companies is. You cannot buy anything like this on the open market.

Imagine what it could've been if they offered something more cash worthy over the RDMA there.

I said long ago that the killer product during the Bitcoin boom was not the mining itself, but leasing and renting the rigs. Your capital costs get covered near instantly, and you can cash out the next week. I believe that the whole "cloud" thing will eventually follow this path.

This is a (rambling) underrated pro comment and you should turn each of these little vignettes into blog posts.

Six t2.large instances sounds pretty efficient. For high-volume image resizing on mobile devices we have access to GPU libraries. I wonder if something CUDA or OpenCL powered would help increase efficiency in a cloud-based service.

A trick I've seen at least one large site (Feedly) use is piggybacking off Google's image serving infrastructure. Their ggpht/googleusercontent system gives you access to an image manipulation platform with more features than many open-source solutions (width, height, blur, rotate, frame, invert, etc). The only legitimate way to use it is through an application on their App Engine platform, and I'm not sure why they don't offer it as part of the Google Cloud suite. Feedly seems to take the URL in their App Engine instance (seemingly dedicated solely to this) and redirect to a Google URL which can then use the image manipulation features. Does anyone else here do something similar?

Edit: forgot to mention, the App Engine documentation is very limited and only mentions width/height resizing. Searching Stack Overflow and other sites, however, reveals many other available modifiers.

Edit 2: Also worth mentioning is that (ab)use of this service is quite popular with illegal sites. Who doesn't love offloading their image bandwidth to Google's image proxies?