Umm, we're talking about just over 18MB here (1200 * 1000 pixels, 16 bytes/pixel, see http://pixenomics.tumblr.com/post/16895861678/how-to-send-1-...). That you can just dump over the wire as a binary blob. Why are we talking about this again? Use your favourite language, just keep it in a big blob in memory, and have fun.
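For anyone who wants to check the arithmetic, here is a minimal sketch of the "big blob in memory" idea in Python. The row-major layout and the helper names (`blob_size`, `pixel_offset`, `write_pixel`, `read_pixel`) are assumptions for illustration, not anything from the article.

```python
# Sketch only: assumes pixels are stored row-major, 16 bytes each.
WIDTH, HEIGHT = 1200, 1000
BYTES_PER_PIXEL = 16


def blob_size(width=WIDTH, height=HEIGHT, bytes_per_pixel=BYTES_PER_PIXEL):
    """Total size of the canvas in bytes."""
    return width * height * bytes_per_pixel


def pixel_offset(x, y, width=WIDTH, bytes_per_pixel=BYTES_PER_PIXEL):
    """Byte offset of pixel (x, y) in a row-major blob."""
    return (y * width + x) * bytes_per_pixel


# The whole canvas as one mutable in-memory blob (19,200,000 bytes).
blob = bytearray(blob_size())


def write_pixel(x, y, data):
    """Overwrite the 16 bytes belonging to pixel (x, y)."""
    if len(data) != BYTES_PER_PIXEL:
        raise ValueError("pixel data must be exactly 16 bytes")
    start = pixel_offset(x, y)
    blob[start:start + BYTES_PER_PIXEL] = data


def read_pixel(x, y):
    """Return the 16 bytes belonging to pixel (x, y)."""
    start = pixel_offset(x, y)
    return bytes(blob[start:start + BYTES_PER_PIXEL])
```

`blob_size()` comes to 19,200,000 bytes, which is about 18.3 MiB, and `bytes(blob)` is what you would send over the wire.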
Memory isn't the issue. The issue is processing the data and converting the stored form into a format the client can read. A big blob isn't easy to send to the client unless it's an image or something similar, and then it becomes a problem when you want to manipulate or process the data.
Seems to me they skipped right over the most obvious option: Redis.
It's quite fast, you can use a Redis string as a random-access array of up to 512MB each, and there are several good ways to handle persistence/backup. I don't think there was a need for them to write any C themselves.
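To make the random-access point concrete, here is a rough sketch built on the Redis `GETRANGE` and `SETRANGE` commands, which redis-py exposes as `getrange` and `setrange`. The 16-byte row-major pixel layout, the helper names, and the key name are assumptions for illustration; `client` is expected to be something like `redis.Redis()`.

```python
# Sketch only: assumes a 1200-pixel-wide canvas, 16 bytes per pixel,
# stored row-major in a single Redis string.
PIXEL_SIZE = 16
WIDTH = 1200


def pixel_range(x, y, width=WIDTH, pixel_size=PIXEL_SIZE):
    """Return (start, end) byte offsets of pixel (x, y).

    GETRANGE treats `end` as inclusive, hence the -1.
    """
    start = (y * width + x) * pixel_size
    return start, start + pixel_size - 1


def get_pixel(client, key, x, y):
    """Read one pixel without fetching the whole string."""
    start, end = pixel_range(x, y)
    return client.getrange(key, start, end)


def set_pixel(client, key, x, y, data):
    """Overwrite one pixel in place with SETRANGE."""
    if len(data) != PIXEL_SIZE:
        raise ValueError("pixel data must be exactly 16 bytes")
    start, _ = pixel_range(x, y)
    client.setrange(key, start, data)
```

With a running server this would be used as `set_pixel(r, "canvas", 3, 2, b"\x01" * 16)` followed by `get_pixel(r, "canvas", 3, 2)`, where `r = redis.Redis()` and `"canvas"` is a made-up key name.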
The 4th paragraph explained why they didn't go this way:
> We were reluctant to use a NoSQL solution as this would require retrieving the pixels through a socket, storing it in memory and then processing them. It makes more sense to process it where it’s stored.
According to the article, their Node solution took 4 seconds to run (down from 7 seconds after some optimization) and their C solution took 0.03 seconds. Now maybe they could have sped up their Node code more, but that sort of improvement hardly counts as premature optimization.
Since the usual expected slowdown for JIT-compiled scripts is somewhere on the order of 5 times (obviously, this is a very loose guess, and the number will vary by script, style, and workload), I wonder what they could have been doing to cause a slowdown of roughly 130x to 230x (4 or 7 seconds against 0.03 seconds).
We were looking at that (as well as Riak), but processing the data would require pulling all of it into PHP. I guess you could do the processing in C, but then it's just as easy to store it there as well.
I'm not clear why you're worried about that. Is it the pulling, or the processing?
The pulling shouldn't be an issue -- I don't know about PHP, but in pure Python, I can pull an arbitrary 10MB string from Redis in ~85-90ms. With hiredis (C extension), that falls to about 47ms.
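If you want to measure this on your own setup, a small timing helper along these lines would do. It is only a sketch: `time_fetch` is a made-up name, and it times whatever callable you hand it, so the Redis call itself is up to you.

```python
import time


def time_fetch(fetch, repeats=5):
    """Call `fetch` `repeats` times; return the best wall-clock time in ms."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fetch()
        elapsed_ms = (time.perf_counter() - t0) * 1000.0
        best = min(best, elapsed_ms)
    return best
```

For example, with a redis-py client `r` and a 10MB value stored under a hypothetical key `"blob"`, `time_fetch(lambda: r.get("blob"))` gives a number comparable to the ones above.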
I can't speak to processing, since I don't know exactly what transformations you're performing.