Hacker News
by nknight 5220 days ago
Seems to me they skipped right over the most obvious option: Redis.

It's quite fast, you can use a Redis string as a random-access array of up to 512 MB each, and there are several good ways to handle persistence/backup. I don't think there was a need for them to write any C themselves.
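A minimal sketch of what "a Redis string as a random-access array" could look like, assuming one byte per pixel stored row-major. The layout, the `WIDTH` value, and the helper names are hypothetical and not from the article; `client` is anything exposing `getrange`/`setrange` the way redis-py does (created without `decode_responses`, so replies are bytes).

```python
# Hypothetical layout: one byte per pixel, row-major, fixed image width.
WIDTH = 1200  # assumed; the thread only says roughly 1.2 megapixels


def pixel_offset(x, y, width=WIDTH):
    """Byte offset of pixel (x, y) inside the Redis string."""
    return y * width + x


def get_pixel(client, key, x, y, width=WIDTH):
    """Read one pixel with GETRANGE (both ends of the range are inclusive)."""
    off = pixel_offset(x, y, width)
    data = client.getrange(key, off, off)
    return data[0] if data else 0  # bytes past the end come back empty


def set_pixel(client, key, x, y, value, width=WIDTH):
    """Write one pixel with SETRANGE; Redis zero-pads the string if needed."""
    client.setrange(key, pixel_offset(x, y, width), bytes([value]))


def get_row(client, key, y, width=WIDTH):
    """Fetch a whole row in one round trip instead of one call per pixel."""
    start = pixel_offset(0, y, width)
    return client.getrange(key, start, start + width - 1)
```

With redis-py this would be called as something like `set_pixel(redis.Redis(), "canvas", 10, 20, 255)`, where the key name is made up. Fetching whole rows (or the whole string) per call matters more than the per-pixel helpers, since each call is a network round trip.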

2 comments

The 4th paragraph explained why they didn't go this way:

> We were reluctant to use a NoSQL solution as this would require retrieving the pixels through a socket, storing it in memory and then processing them. It makes more sense to process it where it’s stored.

Maybe I don't understand the problem, but that sounds like some serious premature optimization. 1.2 megapixels is not much data.
According to the article, their Node solution took 4 seconds to run (down from 7 seconds after some optimization) and their C solution took 0.03 seconds. Now maybe they could have sped up their Node code more, but that sort of improvement hardly counts as premature optimization.
Since the usual expected slowdown for JIT-compiled scripts is somewhere on the order of 5x (obviously, this is a very loose guess, and the number will vary by script, style, and workload), I wonder what they could have been doing to cause a 200x slowdown.
We were looking at that (as well as Riak), but processing the data would require pulling all of it into PHP. I guess you could do the processing in C, but then it's just as easy to store it there as well.
Have you looked into the Lua scripting option for Redis? It allows some processing to happen on the server side, and it's quite powerful.
That sounds like a good option. Thanks, will note it.
I'm not clear why you're worried about that. Is it the pulling, or the processing?

The pulling shouldn't be an issue -- I don't know about PHP, but in pure Python, I can pull an arbitrary 10MB string from Redis in ~85-90ms. With hiredis (C extension), that falls to about 47ms.

I can't speak to processing, since I don't know exactly what transformations you're performing.

It's more the iteration over each pixel and its neighbors (of which there are 8), making it around 9.6 million iterations.
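To make that count concrete, here is a rough sketch of such a pass in pure Python. It only sums each pixel's neighbors, which is an assumption for illustration; the actual per-pixel transformation isn't described in the thread, and the function name and flat one-byte-per-pixel layout are made up.

```python
def neighbor_sums(pixels, width, height):
    """For each pixel, sum its (up to 8) neighbors.

    `pixels` is a flat, row-major sequence of ints. Returns the list of
    sums and the total number of neighbor reads performed.
    """
    sums = [0] * (width * height)
    visits = 0
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue  # skip the pixel itself
                    nx, ny = x + dx, y + dy
                    # border pixels have fewer than 8 neighbors
                    if 0 <= nx < width and 0 <= ny < height:
                        total += pixels[ny * width + nx]
                        visits += 1
            sums[y * width + x] = total
    return sums, visits
```

For 1.2 million pixels that is at most 8 × 1.2 million = 9.6 million neighbor reads, slightly fewer in practice because edge and corner pixels have fewer neighbors.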

We will probably head towards Redis in the future, when precise backups become essential. We're undecided about what will do this processing, though.

We built a GPU-accelerated NoSQL datastore. Using it, this can be accelerated 100x, provided you switch to a binary pixel format.
Why would you use a GPU-accelerated storage when latency is the main goal?
GPUs do not accelerate raw storage retrieval, but they do accelerate processing, such as queries and map-reduce.

Use an APU / HPU if PCIe latency is a problem.

As I understand it, they are running something like a convolution (i.e., each pixel is calculated from its surrounding pixels). This will be fast using the OpenCL model.