We were looking at that (as well as Riak), but processing the data would require pulling all of it into PHP. I guess you could do the processing in C, but then it's just as easy to store the data there as well.
I'm not clear why you're worried about that. Is it the pulling, or the processing?
The pulling shouldn't be an issue -- I don't know about PHP, but in pure Python, I can pull an arbitrary 10MB string from Redis in ~85-90ms. With hiredis (C extension), that falls to about 47ms.
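If you want to check the numbers on your own setup, here is a rough sketch of how I'd time it. `time_get` and `FakeClient` are names I made up for illustration; the fake client is only there so the snippet runs without a server, and it says nothing about real Redis latency.

```python
import time


def time_get(client, key, repeats=10):
    """Return (best_seconds, payload_length) over repeated client.get(key) calls."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        value = client.get(key)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        size = len(value) if value is not None else 0
    return best, size


class FakeClient:
    """Hypothetical in-memory stand-in exposing the same get/set shape."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


client = FakeClient()
client.set("blob", b"x" * (10 * 1024 * 1024))  # arbitrary 10MB payload
best, size = time_get(client, "blob", repeats=5)
print(f"{size} bytes, best of 5: {best * 1000:.3f} ms")
```

To measure the real thing, pass a redis-py client (`redis.Redis()`, which also has `set` and `get`) instead of `FakeClient`. As far as I know, redis-py picks up hiredis automatically when it is installed, but check that for your version.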
I can't speak to processing, since I don't know exactly what transformations you're performing.
GPUs do not accelerate raw storage retrieval; they accelerate processing, such as queries and map-reduce.
Use an APU/HPU if PCIe latency is a problem.
I understood that they are running something like a convolution (i.e., each pixel is calculated from its surrounding pixels); this will be fast using the OpenCL model.
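To make the "each pixel from surrounding pixels" point concrete, here is a minimal pure-Python sketch of a 3x3 neighborhood filter. It is my guess at the kind of operation involved, not their actual code, and `convolve3x3` is a hypothetical name. The point is that every output pixel reads only the input image, so the two outer loops could in principle become one OpenCL work item per pixel.

```python
def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to a 2D list of numbers.

    Border pixels are copied unchanged. The kernel is not flipped, so
    strictly this is a correlation; for symmetric kernels the result is
    the same as a convolution.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            # Sum the weighted 3x3 neighborhood around (x, y).
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc
    return out


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(convolve3x3(image, ones)[1][1])  # sum of all nine pixels
```

Since no iteration depends on another iteration's output, the work parallelizes cleanly. Whether it ends up faster overall still depends on the cost of moving the data to and from the device.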